Abstract

There is a rapidly growing interest in problem-scale parallelism,
both as a model of animal brains and as a paradigm for VLSI. Work
at Rochester has concentrated on connectionist models and their
application to vision. This paper lays out a framework for dealing
with such problems. The framework is built around computational
modules, the simplest of which are termed p-units. We develop their
properties and show how they can be applied to a variety of
problems.

To show how the framework can be applied to computational
problems in vision, two specific examples are developed in some
detail. In the first, we describe how spatially distributed data can be
associated with a complex concept. In the second, we discuss the
shape from shading problem and show how a global parameter, such
as light source position, interacts with the calculation of a spatially
distributed parameter such as surface orientation.

The preparation of this paper was supported in part by the Defense Advanced
Research Projects Agency, monitored by the ONR, under Contracts [numbers illegible].


Table  of  Contents 


1. Introduction

2. Neuron-Like Computing Units
   - definitions
   - p-units
   - lateral inhibition
   - q-units and compound units
   - units employing p and q
   - disjunctive firing conditions
   - conjunctive connections
   - change

3. Networks of Units
   - winner-take-all (WTA) networks
   - the question of delicacy
   - stable coalitions
   - an artificial example

4. Distributed (Massively Connected) Computing
   - using a unit to represent a value
   - modifiers and mappings
   - time and sequence
   - conserving connections
   - fixed resolution computation
   - low resolution grain

5. Some Implications for Early Visual Processing
   - objects
   - tuning
   - sequencing
   - motor control
   - shape from shading
   - implementation in networks
   - other networks

Conclusions

Appendix: Summary of Definitions and Notation
References




1.  Introduction 


Animal brains do not compute like a serial computer. Comparatively slow (millisecond) neural
computing elements with complex, parallel connections form a structure which is dramatically
different from a high-speed, predominantly serial machine. Much of current research in the
neurosciences is concerned with tracing out these connections and with discovering how they
transfer information. One purpose of this paper is to suggest how connectionist theories of the brain
can be used to produce testable, detailed models of interesting behaviors.

Artificial intelligence and articulating cognitive sciences have made great progress by employing
models based on conventional digital computers as theories of intelligent behavior. But a number of
crucial phenomena such as associative memory, priming, perceptual rivalry, and the remarkable
recovery ability of animals have not yielded to this treatment. The other major goal of this paper is
to lay a foundation for the systematic use of massively parallel connectionist models in the cognitive
sciences, even where these are not yet reducible to physiology.

The connectionist view of brain and behavior is that all encodings of importance in the brain
are in terms of the relative strengths of synaptic connections. The fundamental premise of
connectionism is that individual neurons do not transmit large amounts of symbolic information.
Instead they compute by being appropriately connected to large numbers of similar units. This is in
sharp contrast to the conventional computer model of intelligence prevalent in computer science and
cognitive psychology. While the connectionist view has a much stronger physiological foundation,
explicit models of behavior have been almost exclusively cast in the framework of computer-like
information processing models. Connectionism has been associated with a pre-computational view
that knowing the connection structure of a system is all that is required for its understanding.
Recent advances in digital hardware, vision research, and the theory of computation have caused
renewed interest in highly parallel computational models more in keeping with the connectionist
paradigm. It now appears to be feasible to construct models which are simultaneously structurally
and functionally sound.

The fundamental distinction between the conventional and connectionist computing models can
be grasped in the following example. When we see an apple and say the phrase "wormy apple,"
some information must be transferred, however indirectly, from the visual system to the speech
system. Either a sequence of special symbols that denote a wormy apple is transmitted to the speech
system, or there are special connections to the speech command area for the words. Figure 1 is a
graphic presentation of the two alternatives. The path on the right described by double-lined arrows
depicts the situation (as in a computer) where the information that a wormy apple has been seen is
encoded by the visual system and sent as an abstract message (perhaps frequency-coded) to a
general receiver in the speech system which decodes the message and initiates the appropriate
speech act. We have not encountered anyone who will defend this model as biologically plausible.

Figure 1: Connectionism vs. Symbolic Encoding.

The only alternative that we have been able to uncover is described by the path with single-width
arrows. This suggests that there are (indirect) links from the units (cells, columns, centers, or
what-have-you) that recognize an apple to some units responsible for speaking the word. The
connectionist model requires only very simple messages (e.g. stimulus strength) to cross a channel
but puts strong demands on the availability of the right connections.

Over the past few years, we have been exploring the efficacy of formulating detailed models of
intelligent behavior directly in connectionist terms. This kind of effort is in the tradition of
McCulloch-Pitts machines and Perceptrons and has long been viewed as a good way of attacking
problems in low-level vision. Until recently, work in this mode has been mainly just suggestive:
examining properties of networks, attempting to match wave-forms, etc. There was little of the
detailed specification of non-trivial behavioral models which characterizes AI and cognitive
psychology. Currently, a great deal of successful vision work in this laboratory and elsewhere has its
basis in highly parallel models [Hanson and Riseman, 1978]. One particularly fruitful insight for us
has been the correspondence between the so-called Hough techniques [Ballard, 1981a] and
connectionist models. We are continuing to work on detailed parallel models of visual functions
[Ballard, 1981b; Sabbah, 1981; Ballard and Sabbah, 1981] and some examples will be used as
illustrations in this paper.

But the connectionist dogma suggests that all mental functions, not just low-level vision, can be
well described in terms of richly connected networks transmitting very simple signals. We have done
some preliminary work [Feldman, 1980; 1981] on laying out the advantages and difficulties in such
an approach. The purpose of this paper is to prepare a solid foundation for the construction of
detailed connectionist models. This involves defining a set of primitive units, considering some of
their properties, and using these to solve some problems that seem to be prerequisite to any
widespread use of connectionist models.

The body of this paper has four sections. Section Two contains the basic definitions for a
tractable and biologically plausible neuron-level computing unit. Although there is a rich tradition
of neural modeling research, much of which will be useful to us, our definitions depart from
standard ones. A primitive unit can have both symbolic and numerical state, can treat its inputs
non-uniformly, and need not compute a linear function. A particularly important construct is the
use of groups or "conjunctions" of input connections. Some important special cases and some
simple examples, based on lateral inhibition, are presented. Encapsulation techniques are suggested
as a basis for simplifying larger problems.

Section Three is concerned with the general computing abilities of networks of our units. The
crucial point is achieving a single coherent action in a diffuse set of units. Winner-take-all (WTA)
networks are introduced as our solution to this problem for single layers. More generally, we define
and study the idea of a stable coalition of units whose mutual reinforcement has the effect of a
single action, perception, etc.

Section Four concentrates on some specific computations and how they can be effectively
performed within the model. We begin with computing simple functions like multiplication and
show how general parameters can be treated. Modifiers and mappings are used to show how
connections can effectively be treated as dynamic. An extension of this idea allows us to treat
time-varying data like speech.

In Section Five we tackle some additional classic problems for connectionism and apply our
ideas to some more problems in visual perception. A representation for conjunctive concepts such as
"big blue cube" is laid out and applied to the description of complex objects. Finally, as another
indication of the way we intend to proceed, a fairly detailed connectionist model of
shape-from-shading computations is presented.


2. Neuron-Like Computing Units

Definitions

As part of our effort to develop a generally useful framework for connectionist theories, we
have developed a standard model of the individual unit. It will turn out that a "unit" may be used
to model anything from a small part of a neuron to the external functionality of a major subsystem.
But the basic notion of unit is meant to loosely correspond to an information processing model of
our current understanding of neurons. The particular definitions here were chosen to make it easy
to specify detailed examples of relatively complex behaviors. There is no attempt to be minimal or
mathematically elegant. The various numerical values appearing in the definitions are arbitrary, but
fixed finite bounds play a crucial role in the development. The presentation of the definitions will
be in stages, accompanied by examples. A compact technical specification for reference purposes is
included as Appendix A.

Each unit is a computational entity comprising

{q} -- a set of discrete states, < 10
p -- a continuous value in [-1, 1], called potential (accuracy of 10 digits)
v -- an output value, integers 0 ≤ v ≤ 9
i -- a vector of inputs i_1, ..., i_n

and functions from old to new values of these

$$p \leftarrow f(i, p, q)$$
$$q \leftarrow g(i, p, q)$$
$$v \leftarrow h(i, p, q)$$

which we assume, for now, to compute continuously. The form of the f, g, and h functions will
vary, but will generally be restricted to conditionals and functions found on hand calculators. There
are both biological and computational reasons for allowing units to respond (for example)
logarithmically to their inputs. The notation is borrowed from the assignment statement of
programming languages. This notation covers both continuous and discrete time formulations and
allows us to talk about some issues without any explicit mention of time. Of course, certain other
questions will inherently involve time, and computer simulation of any network of units will raise
delicate questions of discretizing time.

P-Units

For some applications, we will be able to use a particularly simple kind of unit whose output v
is proportional to its potential p (rounded) and which has only one state. In other words

$$p \leftarrow p + \beta \sum_k i_k \qquad [-1 \le p \le 1]$$
$$v = \alpha p - \theta \qquad [v = 0 \ldots 9]$$

where $\beta$, $\alpha$, $\theta$ are constants.

The p-unit is somewhat like classical linear threshold elements (Perceptrons [Minsky and
Papert, 1972]), but there are several differences. The potential, p, is a crude form of memory and is
an abstraction of the instantaneous membrane potential that characterizes neurons.
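
The p-unit update above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `p_unit_step`, the default constants, the round-half-up rule, and the clamping of p to [-1, 1] and v to 0...9 are all assumptions made for this sketch.

```python
import math

def p_unit_step(p, inputs, beta=0.1, alpha=10.0, theta=0.0):
    """One discrete update of a p-unit; returns (new_p, v)."""
    # p <- p + beta * sum(i_k), held within the assumed range [-1, 1]
    new_p = max(-1.0, min(1.0, p + beta * sum(inputs)))
    # v = alpha * p - theta, rounded half up and held within 0..9
    raw = alpha * new_p - theta
    v = max(0, min(9, math.floor(raw + 0.5)))
    return new_p, v

if __name__ == "__main__":
    print(p_unit_step(0.0, [3, 2]))   # moderate excitation
    print(p_unit_step(0.9, [5]))      # potential and output both saturate
    print(p_unit_step(0.0, [-4]))     # net inhibition gives zero output
```

With the defaults chosen here, the output is simply ten times the potential, rounded, which is the special case used in the lateral inhibition examples.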

The restriction that output take on small integer values is central to our enterprise. The firing
frequencies of neurons range from a few to a few hundred impulses per second. In the 1/10 second
needed for basic mental events, there can only be a limited amount of information encoded in
frequencies. The ten output values are an attempt to capture this idea. A more accurate rendering
of neural events would be to allow 100 discrete values with noise on transmission (cf. [Sejnowski,
1977]). If it turns out that local "graded" potentials cannot be effectively quantized, the definitions
will have to be extended to allow local exchange of continuous information. Transmission time is
assumed to be negligible; delay units can be added when transit time needs to be taken into
account.


Example 1

One problem with the definition above of a p-unit is that its potential does not decay in the
absence of input. This decay is both a physical property of neurons and an important computational
feature for our highly parallel models. One computational trick to solve this is to have an inhibitory
connection from the unit back to itself.

Figure 2: Self-Inhibition and Decay.

We will follow the usual notation that a connection with a circular tip is inhibitory. More
complex networks will sometimes be specified by a connection table instead of a diagram.
Informally, we identify the negative self-feedback with an exponential decay in potential, which is
mathematically equivalent. We will specify this more carefully below and add the notion of weights
on inputs.
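
To see why negative self-feedback amounts to exponential decay, consider a unit with no external input whose only input is its own output, returned through an inhibitory link. As a sketch, assume the output is proportional to the potential and lump all constants into a single rate k (the symbol k and the small-step limit are assumptions of this illustration, not definitions from the text):

```latex
p(t + \Delta t) = p(t) - k\,\Delta t\,p(t)
\quad\Longrightarrow\quad
\frac{dp}{dt} = -k\,p
\quad\Longrightarrow\quad
p(t) = p(0)\,e^{-kt}.
```

In discrete time with unit steps the same feedback gives a geometric decay, p(t) = (1 - k)^t p(0), which approaches the exponential form when k is small.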

The first step is to elaborate the input vector i in terms of received values, weights, and
modifiers:

$$\forall j,\quad i_j = r_j \cdot w_j \cdot m_j \qquad j = 1 \ldots n$$

where r_j is the value received from a predecessor [r = 0...9]; w_j is a changeable weight, unsigned
[0 ≤ w_j ≤ 1] (accuracy of 10 digits); and m_j is a synapto-synaptic modifier which is either 0 or 1.

The weights are the only thing in the system which can change with experience. They are
unsigned because we do not want a connection to change sign. The modifier or gate greatly
simplifies many of our detailed models in Section 4. One could, of course, use extra units instead,
but the biological evidence for blocking inhibition is solid.
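
The elaboration of the input vector into received values, weights, and modifiers can be written out directly. The following is a small sketch under stated assumptions: the function name `effective_inputs` is hypothetical, and the range check on received values (0...9, matching the output range of a unit) is our reading rather than something the text spells out.

```python
def effective_inputs(received, weights, modifiers):
    """Compute i_j = r_j * w_j * m_j for every input site j."""
    if not (len(received) == len(weights) == len(modifiers)):
        raise ValueError("need one received value, weight, and modifier per input")
    for r, w, m in zip(received, weights, modifiers):
        # Assumed ranges: r in 0..9, unsigned weight in [0, 1], gate 0 or 1.
        if not (0 <= r <= 9 and 0.0 <= w <= 1.0 and m in (0, 1)):
            raise ValueError("value outside the assumed ranges")
    return [r * w * m for r, w, m in zip(received, weights, modifiers)]

if __name__ == "__main__":
    # The second input is blocked by its modifier (m = 0).
    print(effective_inputs([6, 4, 9], [0.5, 1.0, 0.25], [1, 0, 1]))
```

Setting a modifier to 0 removes that input entirely, whatever its weight, which is the gating role the text assigns to blocking inhibition.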

Lateral Inhibition: Several Cases

Mutual lateral inhibition is widespread in nature and has been one of the basic computational
schemes used in modeling. We will present two examples of how it works, to aid intuition as
well as to illustrate the notation. The basic situation is symmetric configurations of p-units which
mutually inhibit one another. Time is broken into discrete intervals for these examples. The
examples are too simple to be realistic, but do contain ideas which we will employ repeatedly.

Example 2: Two P-Units Symmetrically Connected

Suppose v = 10p, w_1 = .1, w_2 = .05(-). It is easier to use P = 10p internally and round
output:

$$P(t+1) = P(t) + r_1 - (.5)\,r_2 \qquad r_j = \text{received}$$
$$v = \mathrm{round}(P) \qquad [0 \ldots 9]$$

Referring to Figure 3, suppose the initial input to the unit A.1 is 6, then 2 per time step, and the
initial input to B.1 is 5, then 2 per time step.

Figure 3: Two P-Units Symmetrically Connected, with Table.

This  system  will  stabilize  to  the  side  of  the  larger  of  two  instantaneous  inputs. 
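
The behavior of Example 2 can be reproduced with a small simulation. This is a sketch under assumptions that the text leaves open: both potentials start at 0, the two units update synchronously from the previous step's outputs, rounding is half-up, and P saturates at ±10 (that is, p stays in [-1, 1]). The function names are hypothetical.

```python
import math

def clamp(x, lo, hi):
    return max(lo, min(hi, x))

def output(P):
    # v = round(P), restricted to the integers 0..9 (round half up)
    return int(clamp(math.floor(P + 0.5), 0, 9))

def simulate_pair(first_a, first_b, later=2.0, steps=10):
    """Two mutually inhibiting p-units, P(t+1) = P(t) + r1 - 0.5 * r2."""
    P = [0.0, 0.0]
    v = [0, 0]
    history = []
    for t in range(steps):
        ext = (first_a, first_b) if t == 0 else (later, later)
        # r1 is the external input, r2 is the rival's previous output
        P = [clamp(P[k] + ext[k] - 0.5 * v[1 - k], -10.0, 10.0) for k in range(2)]
        v = [output(x) for x in P]
        history.append((t, P[0], P[1], v[0], v[1]))
    return history

if __name__ == "__main__":
    for t, pa, pb, va, vb in simulate_pair(6, 5):
        print(f"t={t}  P_A={pa:5.1f}  P_B={pb:5.1f}  v_A={va}  v_B={vb}")
```

Under these assumptions the unit that received the larger initial input ends with output 9 and its rival with output 0, even though both receive identical input after the first step.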

It is interesting to also look at a continuous version of this example. The continuous
approximation to the defining equations for Example 2 can be written:

$$P_1' = 2 - .5\,P_2$$
$$P_2' = 2 - .5\,P_1$$

where P_1 is ten times the potential of A and P_2 of B, and where P_1' is the derivative of P_1 with
respect to time. This system of linear differential equations can be solved by standard techniques for
the initial conditions P_1 = 6, P_2 = 5. The solutions are

$$P_1 = 4 + \tfrac{1}{2}\,e^{t/2} + \tfrac{3}{2}\,e^{-t/2}$$
$$P_2 = 4 - \tfrac{1}{2}\,e^{t/2} + \tfrac{3}{2}\,e^{-t/2}$$

First note that the last term in each equation is a negative exponential and can be neglected.
The resulting relation indicates clearly the rapid decay of P_2 and rise of P_1. Linear systems theory
is only an approximation to our models, which in general are nonlinear. For example, the above
equations do not take into account the fact that the potentials saturate. Nonetheless, the theory can
be an important aid in understanding some properties of our networks.
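
The solution can be checked by decoupling the two equations. A short derivation follows, using the sum S = P_1 + P_2 and the difference D = P_1 - P_2 (symbols introduced here only for this check):

```latex
\begin{aligned}
S' &= 4 - \tfrac{1}{2} S, \quad S(0) = 11 \;\Longrightarrow\; S(t) = 8 + 3e^{-t/2}, \\
D' &= \tfrac{1}{2} D, \quad D(0) = 1 \;\Longrightarrow\; D(t) = e^{t/2}, \\
P_1 &= \tfrac{1}{2}(S + D) = 4 + \tfrac{1}{2} e^{t/2} + \tfrac{3}{2} e^{-t/2}, \\
P_2 &= \tfrac{1}{2}(S - D) = 4 - \tfrac{1}{2} e^{t/2} + \tfrac{3}{2} e^{-t/2}.
\end{aligned}
```

At t = 0 these give P_1 = 6 and P_2 = 5, matching the initial conditions. The difference D grows exponentially while the sum settles toward 8, which is the linear counterpart of the winner pulling away from the loser.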

Example 3: Two Symmetric Coalitions of 2 Units

v = 10p
w_1 = .1
w_2 = .05
w_3 = .05(-)

$$P(t+1) = P(t) + r_1 + .5\,r_2 - .5\,r_3$$
$$v = \mathrm{round}(P)$$

A, C start at 6; B, D at 5;
A, B, C, D have no external input for t > 1

Figure 4: Two Symmetric Coalitions of 2 Units, with Table.

This system converges faster than the previous example. The idea here is that units A and C
form a "coalition" with mutually reinforcing connections. The competing units are A vs. B and C
vs. D. Example 3 is the smallest network depicting what we believe to be the basic mode of
operation in connectionist systems. One can imagine, e.g., that C and D are competing phonemes
and that A and B are words which incorporate C and D, respectively.

We have already described the graphical notation which will often be used in examples. The
alternative method is to describe, for each unit, the outgoing connections to other units in tabular
form. Each outgoing v_j (only one for basic units) will have a set of entries of the form

(<receiving unit>.<index>, <±>, <type>)

where any of the last three constructs can be omitted and given its default value. The <±> field
specifies whether the link is excitatory (+) or inhibitory (-) and defaults to +. The <index> is the
input index j in r_j at the receiving end. This index can be used for specifying different weights as
in the examples above. Indexed inputs also allow for functionally different use of various inputs and
many of our examples will exploit this feature. The <type> is either normal, modifier (m), or
learning (x), the default being normal.

For example, the diagram of Example 2 could be replaced by the table:

A:   B.2, -
     Z1

B:   A.2, -
     Z2

Y1:  A.1
Y2:  B.1

where units labeled Y, Z, etc., designate unnamed sources and sinks.

Competing coalitions of units will be the organizing principle behind most of our models.
Consider the two alternative readings of the Necker cube shown in Figure 5. At each level of
visual processing, there are mutually contradictory units representing alternative possibilities. The
dashed lines denote the boundaries of coalitions which embody the alternative interpretations of the
image. A number of interesting phenomena (e.g. priming, perceptual rivalry, subjective contour)
find natural expression in this formalism. We are engaged in an ongoing effort [Sabbah, 1981;
Ballard, 1981b] to model as much of visual processing as possible within the connectionist
framework. This paper is largely an exercise in developing standard mechanisms for this and other
specific modeling projects.

Figure  5:  Necker  Cube. 

Q-Units and Compound Units

Another useful special case arises when one suppresses the numerical potential, p, and relies
upon the finite-state set {q} for modeling. If we also identify each input of i with a separate named
input signal, we can get classical finite automata. A simple example would be a unit that could be
started or stopped from firing.

One could describe the behavior of this unit by a table, with rows corresponding to states in
{q} and columns to possible inputs, e.g.,

           i_1 (start)    i_2 (stop)

Firing     Firing         Null
Null       Firing         Null

The table above is a tabular presentation of our simplified generic function, g = g(i,q), which
describes state changes. In a similar manner, the computation v <- h(i,p,q) could be simplified to v
<- h(q), e.g.,

v <- if q = Firing then 6 else 0.

This could also be added to the table above.
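
The start-stop unit is small enough to write out in full as a finite automaton. The sketch below encodes the transition table as a dictionary and the output rule as a function; the names `TRANSITIONS`, `g`, `h`, and `run`, the signal labels "start" and "stop", and the choice of Null as the initial state are conventions assumed for this illustration.

```python
# State-transition table g(i, q): keys are (current state, named input).
TRANSITIONS = {
    ("Firing", "start"): "Firing",
    ("Firing", "stop"): "Null",
    ("Null", "start"): "Firing",
    ("Null", "stop"): "Null",
}

def g(signal, q):
    """Next state, given the named input signal and the current state."""
    return TRANSITIONS[(q, signal)]

def h(q):
    """Output value: v <- if q = Firing then 6 else 0."""
    return 6 if q == "Firing" else 0

def run(signals, q="Null"):
    """Feed a sequence of signals; return the final state and the outputs."""
    outputs = []
    for s in signals:
        q = g(s, q)
        outputs.append(h(q))
    return q, outputs

if __name__ == "__main__":
    print(run(["start", "start", "stop", "start"]))
```

Because the potential p plays no role, the whole unit is captured by the table and the one-line output rule.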

We have already employed a variety of graphical and textual descriptions of units and
collections of them. The paper will continue to use different representations, but these are all
instances of the general definition. One of the most powerful techniques employed will be
encapsulation and abstraction of a subnetwork by an individual unit. For example, assume that
some system had separate motor abilities for turning left and turning right (e.g. fins). We could use
two start-stop units to model a turn-unit.

Figure 6: A Turn Unit.

There are two important points. The compound unit here has two distinct outputs, where basic
units have only one (which can branch, of course). In general, compound units will differ from
basic ones only in that they can have several distinct outputs.

The main point of this example is that the turn-unit can be described abstractly, independent of
the details of how it is built. For example, using the tabular conventions described above,

             Left        Right       Output Values

a gauche     a gauche    a droit     V_1 = 7, V_2 = 0
a droit      a gauche    a droit     V_1 = 0, V_2 = 8

where the right-going output being larger than the left could mean that we have a right-finned
robot. There is a great deal more that must be said about the use of states and symbolic input
names, about multiple simultaneous inputs, etc., but the idea of describing the external behavior of
a system only in enough detail for the task at hand is at the core of our enterprise. This is one of
the few ways known of coping with the complexity of the magnitude needed for serious modeling
of biological functions. It is not strictly necessary that the same formalism be used at each level of
functional abstraction and, in the long run, we may need to employ a wide range of models. For
example, for certain purposes one might like to expand our units in terms of compartmental models
of neurons like those of [Perkel, 1979]. The advantage of keeping within the same formalism is that
we preserve intuition, mathematics, and the ability to use existing simulation programs.

The idea of encapsulation used in compound units is vital, but one should not think that only a
small number of units are involved in output; rather, only a small fraction of the units in the
subsystem are output units. Some simple biological systems (such as in leech [Stent et al., 1978] or
lobster [Warshaw and Hartline, 1976]) might be able to be completely modeled on the above scale.
But we are more concerned here with complex systems like human vision, etc. For this purpose we
will need yet more abstraction techniques (see below). In human vision even loose coupling will
involve a large number of connections between subsystems, e.g. vestibular and vision.

Units Employing p and q

It will already have occurred to the reader that a numerical value, like our p, would be useful
for modeling the amount of turning to the left or right in the last example. It appears to be
generally true that a single numerical value and a small set of discrete states combine to provide a
powerful yet tractable modeling unit. This is one reason that the current definitions were chosen.
Another reason is that the mixed unit seems to be a particularly convenient way of modeling the
information processing behavior of neurons, as generally described. The discrete states enable one to
model the effects in neurons of abnormal chemical environments, fatigue, etc. One example of a
unit employing both p and q non-trivially is the following crude neuron model. This model is
concerned with saturation and assumes that the output strength, v, is something like average firing
frequency. It is not a model of individual action potentials and refractory periods.

We suppose the distinct states of the unit q ∈ {normal, recover}. In normal state the unit
behaves like a p-unit, but while it is recovering it ignores inputs. The following table captures almost
all of this behavior.

            -1 < p < .9         p > .9          Output Value

normal      p <- p + β Σ i      p <- -p /       v <- α p - θ
            (incomplete)        recover

recover     normal              <impossible>    v <- 0

Here we have the change from one state to the other depending on the value of the potential,
p, rather than on specific inputs. The recovering state is also characterized by the potential being set
negative. The unspecified issue is what determines the duration of the recovering state; there are
several possibilities. One is an explicit dishabituation signal like those in Kandel's experiments
[Kandel, 1976]. Another would be to have the unit sum inputs in the recovering state as well. The
reader might want to consider how to add this to the table.

A third possibility, which we will use frequently, is to assume that the potential, p, decays
toward zero (from both directions) unless explicitly changed. Example 1 showed how this implicit
decay $p \leftarrow p\,e^{-kt}$ can be modeled by self-inhibition. In this case, the decay constant, k, would
determine the length of the recovery period.
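
One way to make the crude neuron model concrete is to pick the third possibility, decay toward zero, as the rule that ends recovery. The sketch below does so under several assumptions that go beyond the text: the name `neuron_step`, the constants, a per-step multiplicative `decay` standing in for the continuous decay, and a hypothetical `wake` threshold (the unit returns to normal once p has risen above it).

```python
def to_output(raw):
    # Round half up and restrict to the integers 0..9.
    return max(0, min(9, int(raw + 0.5))) if raw > 0 else 0

def neuron_step(p, q, inputs, beta=0.1, alpha=10.0, theta=0.0,
                decay=0.5, wake=-0.1):
    """One update of the two-state unit; returns (p, q, v)."""
    if q == "normal":
        # Behaves like a p-unit while normal.
        p = max(-1.0, min(1.0, p + beta * sum(inputs)))
        if p > 0.9:
            # Saturation: potential is set negative and the unit rests.
            return -p, "recover", 0
        return p, "normal", to_output(alpha * p - theta)
    # Recovering: inputs are ignored and p decays toward zero.
    p = p * decay
    if p > wake:  # assumed end-of-recovery test
        return p, "normal", 0
    return p, "recover", 0

if __name__ == "__main__":
    p, q = 0.8, "normal"
    for step in range(6):
        p, q, v = neuron_step(p, q, [2])
        print(step, round(p, 4), q, v)
```

With these settings a smaller `decay` factor, corresponding to a larger decay constant k, shortens the recovery period, which is the dependence the text describes.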

Disjunctive Firing Conditions

It is both computationally efficient and biologically realistic to allow a unit to respond to one of
a number of alternative conditions. One way to view this is to imagine the unit having "dendrites,"
each of which depicts an alternative enabling condition.

Figure 7: A Unit with Disjunctive Inputs.

In terms of our formalism, this could be described in a variety of ways. One of the simplest is
to define the potential in terms of the maximum of the three separate computations, e.g.,

$$p \leftarrow p + \mathrm{Max}(i_1 + i_2,\; i_3 + i_4,\; i_5 + i_6 - i_7)$$

It does not seem unreasonable (given current data) to model the firing rate of a unit as the
maximum of the rates at its active sites. Units whose potential is changed according to the
maximum of a set of algebraic sums will occur frequently in our specific models.

One could replace this unit with three simple p-units plus a maximum unit and get a similar
effect. (Note that the potentials of the p-units wouldn't be equal.) But it appears to be easier to
understand and analyze systems of units that describe intuitively coherent computations. Another
reason for employing disjunctive units is that they appear to be widespread in nature. The firing of
a neuron depends, in many cases, on local spatio-temporal summation involving only a small part of
the neuron's surface. So-called dendritic spikes transmit the activation to the rest of the cell. It also
turns out that inhibitory inputs sometimes block such internal signals that are upstream of the point
of inhibition, rather than just sum with them. It is possible to model a dendritic tree with inhibitory
blocking inputs all within our formalism for a single unit, or as a simple network. One can model
each section of the dendritic tree as a unit which sends output to the unit body unless it is blocked
by a modifier (m_j) input, corresponding to blocking inhibition in neurons. One advantage of
keeping the processing power of our abstract unit close to that of a neuron is that it helps inform
our counting arguments. When we attempt to model a particular function (e.g. stereopsis), we
expect to require that the number of units and connections as well as the execution time required
by the model are plausible.

Conjunctive  Connections 

The max-of-sum unit is the continuous analog of a logical OR-of-AND (disjunctive normal
form) unit and we will sometimes use the latter as an approximate version of the former. The
OR-of-AND unit corresponding to Figure 7 is:

$$p \leftarrow p + \alpha \, \mathrm{OR}\,(i_1 \,\&\, i_2,\ i_3 \,\&\, i_4,\ i_5 \,\&\, i_6 \,\&\, (\mathrm{not}\ i_7))$$

This formulation stresses the requirement that nearby spatial connections all be firing before the
potential is affected. Hence, in the above example, $i_3$ and $i_4$ make a conjunctive connection with the
unit.
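
The OR-of-AND rule can be illustrated with a minimal Python sketch. It treats each input as a boolean (firing or not firing) and uses an arbitrary value for the rate constant; the function name `or_of_and_step`, the dictionary keys, and the default `alpha=0.1` are assumptions made for illustration and are not part of the formalism above.

```python
def or_of_and_step(p, inputs, alpha=0.1):
    """One update p <- p + alpha * OR(i1&i2, i3&i4, i5&i6&(not i7)).

    `inputs` maps the names "i1".."i7" to booleans (firing / not firing).
    `alpha` is an illustrative rate constant, not a value from the text.
    """
    i = inputs
    site_1 = i["i1"] and i["i2"]                  # first conjunctive site
    site_2 = i["i3"] and i["i4"]                  # second conjunctive site
    site_3 = i["i5"] and i["i6"] and not i["i7"]  # third site, blocked by i7
    fired = site_1 or site_2 or site_3            # disjunction over the sites
    return p + alpha * (1 if fired else 0)


quiet = {name: False for name in ("i1", "i2", "i3", "i4", "i5", "i6", "i7")}
pair_on = dict(quiet, i3=True, i4=True)       # both halves of one conjunction
half_on = dict(quiet, i3=True)                # only one half is firing
print(or_of_and_step(0.0, pair_on), or_of_and_step(0.0, half_on))
```

Only the input set in which both members of a conjunction fire changes the potential; a single firing input at a site has no effect.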


Change 


For our purposes, it is useful to have all the adaptability of networks be confined to changes in
weights. While there is known to be some growth of new connections in adults, it does not appear
to be fast or extensive enough to play a major role in learning. For technical reasons, we consider
very local growth or decay of connections to be changes in existing connection patterns. Obviously,
models concerned with developing systems would need a richer notion of change in connectionist
networks (cf. [von der Malsburg and Willshaw, 1977]). Learning and change will not be treated
technically in this paper, but the definitions are included for completeness. We provide each unit
with a change function c:

$$\mu \leftarrow c(i, p, q, x, \mu)$$

where $\mu$ is the intermediate-term memory vector, i, p, and q are as always, and x is an additional
single integer input ($0 \le x \le 9$) which captures the notion of the importance and value of the
current behavior. Instantaneous establishment of long-term memory (which does not seem plausible)
would be equivalent to having $\mu = w$. We are assuming that the consolidation of long-term changes
is a separate process.

We assume that important behaviors, whether favorable or unfavorable, can give rise to faster learning.
The rationale for this is given in [Feldman, 1980; 1981], which also lays out informally our views on
how short- and long-term learning could occur in connectionist networks. We are working on a
more technical presentation of our model of change along the lines of this paper. Obviously
enough, a plausible model of learning and memory is a prerequisite for any serious scientific use of
connectionism. But we have found that an examination of networks for carrying out the basic
building blocks is already enough for one report.

3. Networks

Our general idea of temporal behavior in networks is that of relaxation. The independent
inputs together with the various inter-unit connections are sufficient to cause the networks to behave
in an appropriate manner; each unit should converge to a potential value between -1 and 1. Much
work has been done on relaxation, from classical Gauss-Seidel iterations to more modern applications
in vision (e.g. [Rosenfeld et al., 1976; Marr and Poggio, 1976; Prager, 1980; and Hinton, 1980]). With
a few exceptions, previous work has assumed linear behavior (or linear with a threshold). As one of
the exceptions, Prager used a non-linear model and noted that it enabled him to use more
complicated updating conditions than those in a linear system. Our model also breaks with the
linear tradition in its use of conjunctive connections and state tables. While we still use linear
approximations to analyze the stability of the system, the non-linear units are closer to actual
neurons in behavior and allow vast simplifications in network design.

Winner-Take-All Networks

A very general problem that arises in any distributed computing situation is how to get the
entire system to make a decision (or perform a coherent action, etc.). This is a particularly important
issue for the current model because of its restrictions on information flow and because of the almost
linear nature of the p-units used in many of our specific examples. One way to deal with the issue
of coherent decisions in a connectionist framework is to introduce winner-take-all (WTA) networks,
which have the property that only the unit with the highest potential (among a set of contenders)
will have output above zero after some settling time. Biologically necessary examples of this
behavior abound, ranging from turning left or right, through fight-or-flight responses, to
interpretations of ambiguous words and images.

There are a number of ways to construct WTA networks from the units described above. We
will discuss several of these, both because of the importance of WTA capabilities and because it is
the first non-trivial problem treated here. The question of identical values (ties) is an important
one, but will be deferred for a few paragraphs. Our first example of a WTA network will operate
in one unit step for a set of contenders each of whom can read the potential of all of the others.
(The fan-in/out of neurons is about 1,000-10,000.) Each unit in the network computes its new
potential according to the rule:

$$p \leftarrow \text{if } p > \max(i_j,\ 0.1) \text{ then } p \text{ else } 0.$$

That is, each unit sets itself to zero if it knows of a higher input. This is fast and simple, but
probably a little too complex to be plausible as the behavior of a single neuron. There is a standard
trick (apparently widely used by nature) to convert this into a more plausible scheme. We replace
each unit above with two units; one computes the maximum of the competitors' inputs and inhibits
the other. This is shown in Figure 8.

Figure  8:  Paired  Units  for  Max  WTA. 
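
The one-step rule can be sketched in Python as follows. The sketch reads the inputs $i_j$ as the potentials of the other contenders and applies the rule to all units simultaneously; the function name `wta_max_step` and the list representation are assumptions for illustration, and the paired-unit circuit of Figure 8 is collapsed into a single function.

```python
def wta_max_step(potentials, threshold=0.1):
    """One simultaneous step of p <- if p > max(i_j, 0.1) then p else 0.

    Each unit reads the potentials of all the other contenders and keeps
    its own potential only if it exceeds both every rival and the threshold.
    """
    new_potentials = []
    for k, p in enumerate(potentials):
        rivals = potentials[:k] + potentials[k + 1:]
        strongest = max(rivals + [threshold])
        new_potentials.append(p if p > strongest else 0.0)
    return new_potentials


print(wta_max_step([0.3, 0.9, 0.5]))
```

Because the comparison is strict, two units that tie for the maximum both switch off in this sketch.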

There are a number of remarks in order. It is not biologically unreasonable to view the firing
rate of a neuron to be the maximum of the rates of its separate sites of spatio-temporal summation.
The circuit above can be strengthened by adding a reverse inhibitory link, or one could use a
modifier on the output, etc. Obviously one could have a WTA layer that got inputs from some set
of competitors and settled to a winner when triggered to do so by some downstream network. This
is an exact analogy of strobing an output buffer in a conventional computer. Another set of
standard ideas (here from theoretical computer science) enables us to build WTA networks among
sets of contenders larger than the allowable fan-in of units. We just arrange the competitors in a
tournament tree [Aho, Hopcroft, and Ullman, 1974] and have the winners at each level play off.
The time required is the height of the tree, which is the logarithm (to the base fan-in) of the size of
the set of contenders and is small for all realistic situations.
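
A tournament tree of this kind might be sketched as below. The function name `tournament_wta`, the grouping of contenders in list order, and the choice to let the first contender win a tied group are illustrative assumptions; the returned round count is the height of the tree.

```python
def tournament_wta(potentials, fan_in):
    """Pick a winner using groups no larger than `fan_in`.

    Returns (index of the winning contender, number of rounds played).
    Within a group, the first contender with the maximal potential wins.
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    if not potentials:
        raise ValueError("need at least one contender")
    contenders = list(range(len(potentials)))
    rounds = 0
    while len(contenders) > 1:
        next_level = []
        for start in range(0, len(contenders), fan_in):
            group = contenders[start:start + fan_in]
            next_level.append(max(group, key=lambda k: potentials[k]))
        contenders = next_level
        rounds += 1
    return contenders[0], rounds


# 1,000 contenders with distinct potentials and a fan-in of 10.
values = [(k * 37) % 1000 for k in range(1000)]
winner, rounds = tournament_wta(values, fan_in=10)
print(winner, values[winner], rounds)
```

With 1,000 contenders and a fan-in of 10 the contest shrinks from 1,000 to 100 to 10 to 1, so three rounds suffice.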

The question of ties remains to be considered. Since we are assuming only a limited range of
output values, quite a few contenders might appear to be equal. Depending on how the WTA
network is being employed, one might want several different ways of treating this situation. A
common idea in computer science is to order the units in some way and have the first in order win
in the case of ties. This is easy to implement by having units turn off if a predecessor is higher than or
equal to itself. In some situations, a random choice might be appropriate. There is reason to believe
that essentially random effects break ties in real neural networks. Randomness can be achieved in
our scheme, e.g., by adding a randomly changing hierarchy, but we will not be using random
selection in this paper.
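
The ordered tie-breaking idea can be sketched as a small variation on the one-step maximum rule. In this hedged sketch a unit turns off if any predecessor is higher than or equal to it, or if any successor is strictly higher; the name `wta_ordered_ties`, the use of list position as the ordering, and the 0.1 threshold carried over from the earlier rule are assumptions.

```python
def wta_ordered_ties(potentials, threshold=0.1):
    """One-step WTA in which the first unit in order wins any tie."""
    new_potentials = []
    for k, p in enumerate(potentials):
        beaten_by_predecessor = any(q >= p for q in potentials[:k])
        beaten_by_successor = any(q > p for q in potentials[k + 1:])
        keep = (p > threshold
                and not beaten_by_predecessor
                and not beaten_by_successor)
        new_potentials.append(p if keep else 0.0)
    return new_potentials


print(wta_ordered_ties([0.5, 0.9, 0.9, 0.2]))
```

Of the two units tied at 0.9, only the earlier one keeps its potential.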

There are two more basic ways of treating ties that deserve mention. One could try to resolve
ties by looking ever more closely at the values of potential among contenders. This would amount
to having "rounds" of competition. First, all units whose high-order digit was sub-maximal would
drop out. Then there would be a play-off based on the second digit, etc. This could be combined
with the tournament tree, but, in the end, one still might have to contend with ties. There do seem
to be situations where some fine-tuning is called for, but the most common situation appears to be
quite different.

Recall that the purpose of WTA networks was to identify a clear winner out of a set of
contending actions, perceptions, etc. Both in nature and in our models, this rarely occurs in the
form of pure competition among a single layer of contenders. For example, the choice of which
word should be assumed to have been heard is influenced by phonemic, semantic, contextual and
general considerations. We believe that WTA-type structures exist, but that they are normally part
of coalitions spanning many layers. Ties in a single WTA layer do not require specific resolution
because the coalition interaction normally will produce a unique overall winner. The idea of
coalitions among members of different competing layers was discussed briefly in Example 3 and will
receive a great deal of attention below.

Multi-layer coalition networks could employ MAX-based WTA circuits, but it often seems
more appropriate to algebraically combine the outputs of units. For this reason, and to tie in with
some important related work, we will now consider WTA circuits based primarily on p-units (which
algebraically sum their inputs).

First we present an abstract solution of the WTA problem which ignores quantization and
bounds. Suppose we have a symmetric network of n + 1 p-units, each of which equally inhibits all
the others, i.e.,

$$p_k \leftarrow p_k - \frac{1}{10n} \sum_{j \neq k} v_j \ ; \quad (v = 10p)$$

If we add one extra unit, AVE, which computes the average of all active (non-zero) outputs and
feeds it (with + polarity) to all the units, we get the desired subnetwork.

$$p_k \leftarrow p_k - \frac{1}{10n} \sum_{j \neq k} v_j + \frac{1}{10b} \sum_{j} v_j$$

where b is the number of non-zero inputs to AVE.

This network has the required behavior because each unit has its potential increased by the
difference between the average of all outputs and the average of all but its own output. Units whose
output is above average will increase while the others decrease. As units go to zero and drop out,
more units go below average. One instance of this would be when a subnetwork with all $p_j$ initially
equal got outside signals which favored one unit. Notice that AVE is not a p-unit, since it counts
non-zero inputs.
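
A single update of the AVE-augmented network can be sketched in Python. The sketch follows the rule above literally for the case in which every unit is still active, ignores quantization and bounds as the abstract solution does, and uses the assumed name `ave_wta_step`; it is meant only to show that above-average units rise and below-average units fall.

```python
def ave_wta_step(p):
    """One update of n + 1 mutually inhibiting p-units plus an AVE unit.

    p_k <- p_k - (1/10n) * sum_{j != k} v_j + (1/10b) * sum_j v_j, v = 10p,
    where b counts the non-zero outputs seen by AVE.
    """
    n = len(p) - 1
    if n < 1:
        raise ValueError("need at least two competing units")
    v = [10.0 * pk for pk in p]
    total = sum(v)
    b = sum(1 for vj in v if vj != 0)
    average = total / b if b else 0.0          # output of the AVE unit
    return [pk - (total - vk) / (10.0 * n) + average / 10.0
            for pk, vk in zip(p, v)]


before = [0.5, 0.4, 0.3]
after = ave_wta_step(before)
print(after)
```

Starting from potentials 0.5, 0.4 and 0.3, the unit above the mean gains, the unit at the mean is unchanged, and the unit below the mean loses.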

A possible problem arises if one takes saturation into account. If two units were both near
saturation, they might easily both reach saturation before the WTA network settled down. For any
network there will be a difference so small that the intent is that the two values are considered
identical. For differences larger than this, one can design the WTA network to converge slowly
enough to prevent multiple unequal units from reaching saturation. This is accomplished by giving
less weight to the positive input from the AVE unit, still assuming that the output can have
continuous values.

Quantization of output values (here 0...9) adds interesting additional issues. For a sufficiently
large network, ten distinct values will not be enough to resolve the difference between the two
averages. There are a variety of computational tricks to exploit the limited dynamic range available.
Some of these, like tournament trees and successive digit comparisons, were mentioned in the
discussion of ties. But the restriction to simple signals is at the heart of our approach and should not
be evaded. We should not build models in which WTA networks involve a large number of
alternatives, nor should we expect very delicate decisions to be made by a single competitive
network.

The  Question  of  Delicacy 

One problem with previous neural modeling attempts is that the circuits proposed were
unnaturally delicate (unstable). Small changes in parameter values would cause the networks to
oscillate or converge to incorrect answers. We will have to be careful not to fall into this trap, but
would like to avoid detailed analysis of each particular model for delicacy. What appears to be
required are some building blocks and combination rules that preserve the desired properties. For
example, the WTA subnetworks of the last example will not oscillate in the absence of oscillating
inputs. This is also true of any symmetric mutually inhibitory subnetwork. This is intuitively clear
and could be proven rigorously under a variety of assumptions (cf. [Grossberg, 1980]).

One useful principle is the employment of lower-bound and upper-bound cells to keep the total
activity of a network within bounds. The idea is an extension of the AVE cell used in the WTA
example. Suppose that we add two extra units, LB and UB, to a network which has coordinated
output. The LB cell compares the total (sum) activity of the units of the network with a lower
bound and sends positive activation uniformly to all members if the sum is too low. The UB cell
inhibits all units equally if the sum of activity is too high. Notice that LB and UB can be
parameters set from outside the network. Under a wide range of conditions (but not all), the
LB-UB augmented network can be designed to preserve order relationships among the outputs $v_j$ of the
original network while keeping the sum between LB and UB.
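
The effect of the LB and UB cells can be sketched as a uniform correction applied to every output. This is only one possible reading: the function name `apply_bounds` and the choice to make up the whole shortfall or excess in one step are assumptions, and saturation of individual units is ignored.

```python
def apply_bounds(outputs, lower, upper):
    """Shift all outputs equally so that their sum lies in [lower, upper].

    A uniform shift models the LB cell exciting (or the UB cell inhibiting)
    every unit by the same amount, so the ordering of outputs is unchanged.
    """
    n = len(outputs)
    total = sum(outputs)
    if total < lower:
        shift = (lower - total) / n      # LB cell: uniform excitation
    elif total > upper:
        shift = -(total - upper) / n     # UB cell: uniform inhibition
    else:
        shift = 0.0
    return [v + shift for v in outputs]


print(apply_bounds([1.0, 2.0, 3.0], lower=9.0, upper=12.0))
print(apply_bounds([5.0, 6.0, 7.0], lower=9.0, upper=12.0))
```

In both calls the relative order of the three outputs is the same before and after the correction.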

We will often assume that LB-UB pairs are used to keep the sum of outputs from a network
within a given range. This same mechanism also goes far towards eliminating the twin perils of
uniform saturation and uniform silence which can easily arise in mutual inhibition networks. Thus
we will often be able to reason about the computation of a network assuming that it stays active
and bounded. We also require that individual units be viewed as part of different subnetworks,
which may be simultaneously active. The general issue of interacting subnetworks entails nothing
less than the whole enterprise, but we can tackle the question of bounds. If we view each output
value $v_i$ in a set of networks comprising n units as the axis of an n-dimensional space, the UB and
LB cells correspond to bounding hyper-planes in this space. The simultaneous imposition of these
conditions defines a convex hull, in which the solution must lie. (Geoff Hinton pointed this out.)
This could turn out to have singularities if some simultaneous solutions are impossible, but this
condition can be checked for in advance.

One problem with the AVE and UB-LB solutions is that they assume that these units can
compute all of the activity of a network. As we have mentioned, the saturation of potential and
limited data transfer rate mean that only an approximation is possible for networks of significant
size. Other results in the literature (e.g. [Grossberg, 1980]) have similar limitations. We will have to
place less reliance on precise calculations by large networks and more on cooperative computation.

Stable  Coalitions 

For a massively parallel system like the one we are envisioning to actually make a decision (or do
something), there will have to be states in which some activity strongly dominates. We have shown
some simple instances of this, in Examples 2 and 3 and the WTA network. But the general idea is
that a very large complex subsystem must stabilize, e.g. to a fixed interpretation of visual input.
The way we believe this to happen is through mutually reinforcing coalitions which instantaneously
dominate all rival activity. The simplest case of this is Example 3, where the two units A and B
form a coalition which suppresses C and D. Phenomenologically, the two renderings of the Necker
Cube in Figure 5 can be viewed as alternative stable coalitions. Formally, a coalition will be called
stable when the output of all of its members is non-decreasing.

What can we say about the conditions under which coalitions will become and remain stable?
We will begin informally with an almost trivial condition. Consider a set of units {a,b,...} which we
wish to examine as a possible coalition, $\pi$. For now, we assume that the units in $\pi$ are all p-units
and are in the non-saturated range and have no decay. Thus for each u in $\pi$,

$$p(u) \leftarrow p(u) + \mathrm{Exc} - \mathrm{Inh},$$

where Exc is the weighted sum of excitatory inputs and Inh is the weighted sum of inhibitory
inputs. Now suppose that $\mathrm{Exc}|\pi$, the excitation from the coalition $\pi$ only, were greater than INH,
the largest possible inhibition receivable by u, for each unit u in $\pi$, i.e.,

$$(\mathrm{SC}) \quad \forall u \in \pi \ ; \ \mathrm{Exc}|\pi > \mathrm{INH}$$

Then it follows that

$$\forall u \in \pi \ ; \ p(u) \leftarrow p(u) + \delta \quad \text{where } \delta > 0.$$

That is, the potential of every unit in the coalition will increase. This is not only true
instantaneously, but remains true as long as nothing external changes (we are ignoring state change,
saturation, and decay). This is because $\mathrm{Exc}|\pi$ continues to increase (recursively) as the potential of
the members of $\pi$ increases. Taking saturation into account adds no new problems; if all of the
units in $\pi$ are saturated, the change, $\delta$, will be zero, but the coalition will remain stable.
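
Condition (SC) can be checked mechanically once the weights are written down. The following Python sketch does so for a toy network loosely modeled on the A-B versus C-D situation; the function name `is_stable_coalition`, the dictionary layout, every weight, and the maximum output of 9 are hypothetical choices made for illustration.

```python
def is_stable_coalition(coalition, outputs, exc_weights, inh_weights,
                        max_output=9):
    """Check (SC): for every u in the coalition, Exc|coalition > INH.

    exc_weights[u] and inh_weights[u] map source units to positive weights.
    INH is the worst case: every inhibitory source at its maximum output.
    """
    for u in coalition:
        exc_from_coalition = sum(w * outputs[src]
                                 for src, w in exc_weights[u].items()
                                 if src in coalition)
        worst_inhibition = sum(w * max_output
                               for w in inh_weights[u].values())
        if exc_from_coalition <= worst_inhibition:
            return False
    return True


# Hypothetical weights: A and B excite each other; C inhibits A, D inhibits B.
exc = {"A": {"B": 0.5}, "B": {"A": 0.5}}
inh = {"A": {"C": 0.1}, "B": {"D": 0.1}}
strong = {"A": 6, "B": 6, "C": 3, "D": 3}
weak = {"A": 1, "B": 1, "C": 3, "D": 3}
print(is_stable_coalition({"A", "B"}, strong, exc, inh))
print(is_stable_coalition({"A", "B"}, weak, exc, inh))
```

With these weights the coalition {A, B} satisfies (SC) once its members are sufficiently active, but not while their outputs are still small.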

The condition that the excitation from other coalition members alone, $\mathrm{Exc}|\pi$, be greater than
any possible inhibition INH for each unit may appear to be too strong to be useful. Observe first
that INH is directly computable from the description of the unit; it is the largest negative weighted
sum possible. If inhibition in our networks is mutual, the upper bound possible after a fixed time $\tau$,
$\mathrm{INH}_\tau$, will depend on the current value of potential in each unit u. The simplest case of this is
when two units are "deadly rivals": each gets all its inhibition from the other. In such cases, it
may well be feasible to show that after some time $\tau$, the stable coalition condition will hold (in the
absence of decay, fatigue, and changes external to the network).

There are a number of interesting properties of the stable coalition principle. First notice that it
does not prohibit multiple stable coalitions nor single coalitions which contain units which mutually
inhibit one another (although excessive mutual inhibition is precluded). If the units in the coalition
had non-zero decay, the coalition excitation $\mathrm{Exc}|\pi$ would have to exceed both INH and decay for
the coalition to be stable. We suppose that a stable coalition yields control when its input elements
change (fatigue and explicit resets are also feasible). To model coalitions with changeable inputs, we
could add boundary elements, whose condition was

$$\mathrm{Exc}|\pi + \mathrm{Input} > \mathrm{INH}$$

and which could disrupt the coalition if its input went too low.

An Artificial Example

The coalitions of units needed to model biologically interesting functions will be large and
heterogeneous. We do not yet have mathematical results that enable us to characterize the behavior
of general coalitions. What we can do now is develop an artificial example of a coalition and
establish the critical aspects of its behavior. This has proven to be useful to us both in aiding
intuition and in constraining the choice of weights for real models. (Paul Shields of U. Toledo and
Stanford provided the basic analysis.)

The artificial coalition consists of M + 1 rows, each of which has N + 1 units which compete by
mutual lateral inhibition (Figure 9). We are assuming here that each unit can have potential and
output values of unlimited range and accuracy and that the output is exactly the potential. This makes it
possible to express the competition in a row as a strictly linear rule:

$$X_{mn} \leftarrow X_{mn} - \beta \sum_{j \neq n} X_{mj} + \text{Coalition Support}$$

Figure 9: Artificially Symmetric Coalition Structure

If we further assume that each coalition is exactly a column and provides positive support
proportional to the sum of its members, the rule becomes

$$(*) \quad X_{mn} \leftarrow X_{mn} + \alpha \sum_{i \neq m} X_{in} - \beta \sum_{j \neq n} X_{mj}$$

Under all these assumptions (*) defines a linear transformation, T, on the collection of values
$X_{mn}$ viewed as a "vector" in the sense of linear algebra. This transformation is sufficiently regular
that we can characterize all of its eigenvalues and eigen"vectors." Recall that an eigenvalue, $\lambda$, and
the associated "vector" X have the property that $TX = \lambda X$. Any such coalition structure, X, will be
stable because repeated applications of the relaxation rule (*) will just multiply every element
repeatedly by the related $\lambda$. What is interesting here is that the configurations of $X_{mn}$ which have
this property are easy to discuss in terms of our model.

Suppose that $X_{mn}$ were such that each column had every one of its elements equal. This might
be a good resting state for the structure because any row would provide the same answer as to the
relative strengths of the various possibilities. The rule (*) becomes:

$$X_{mn} \leftarrow X_{mn} + \alpha M X_{mn} - \beta \sum_{j \neq n} X_{mj}$$

because all M other elements of its column are equal to $X_{mn}$. If we further assume that each row
has the sum of all its elements equal to zero, the remaining summation above must be equal to
$-X_{mn}$ and we get:

$$X_{mn} \leftarrow X_{mn} + \alpha M X_{mn} + \beta X_{mn}$$

or

$$X_{mn} \leftarrow (1 + \alpha M + \beta) X_{mn}$$

which says that $(1 + \alpha M + \beta) = \lambda_+$ is an eigenvalue for T, working on "vectors" with constant
columns and zero row-sums. The condition of a zero sum for a row captures the idea of
competition quite nicely; the fact that this requires negative values to be transmitted is not a serious
problem. It is the assumptions of unbounded scale and accuracy that limit the application of these
results even in the case of purely row-column coalition structures.

The fact that constant-column, zero-row-sum configurations are stable for this structure is
important, but there are several other points to be made. Notice that several columns could have
the same constant value; the problem of ties cannot be resolved by such a uniform system. There
are also other eigenvalues and "vectors" which do not correspond to desirable states of the system.
These are:

    eigenvalue                     "vector" X

    $1 + \alpha M - \beta N$       matrix of all 1
    $1 - \alpha - \beta N$         rows equal, column-sum zero
    $1 - \alpha + \beta$           row-sum and column-sums all zero

By computing the multiplicity of the four eigenvalues, one can show that the total multiplicity
is (N + 1)(M + 1), so that there are no other eigenvalues. The critical point is that powers of a linear
system like T will converge to the direction specified by its largest eigenvalue. If we make sure to
choose $\alpha$ and $\beta$ so that $\lambda_+ = 1 + \alpha M + \beta$ is the largest eigenvalue, then repetitions of (*) will
converge to the desired constant-column zero-row-sum state. This requires (for $\alpha$, $\beta$ positive) that

$$1 + \alpha M + \beta > \alpha + \beta N - 1$$

or

$$(**) \quad 2 > \beta N - \alpha M + (\alpha - \beta).$$

We can ignore $\alpha - \beta$, which is a small fraction. Recall that $\beta$ is the weight given to the competitors
and $\alpha$ the weight given to collaborators. Condition (**) states that if the coalitions are given
adequate weight, the system will settle into a state with uniform columns (coalitions). The obvious
choice of $\beta = 1/N$ and $\alpha = 1/M$ comfortably meets condition (**). The problem that occurs if $\beta$
is too small is that mutual inhibition will have no effect and the system will converge to the state
where all columns have their initial average value. The relative importance of competition and
collaboration will be a crucial part of the detailed specification of any model. There appears to be
no reason that discrete values, bounded ranges and overlapping coalitions should change the basic
character of this result, but the detailed analysis of a realistic coalition structure for its convergence
properties appears to be very difficult. More generally, there will need to be ways of assessing the
impact of finite bounds and discrete ranges on systems whose continuous approximation is
understood, a classic problem in numerical analysis.
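
The eigenvalue argument can be checked numerically with a short, dependency-free Python sketch. It applies rule (*) to a small grid with M = 2 and N = 3 and the weights $\alpha = 1/M$, $\beta = 1/N$; the function names `relax_step` and `relax`, the starting grids, and the rescaling of the grid after every step (so that only its direction is tracked) are assumptions made for illustration.

```python
def relax_step(X, alpha, beta):
    """Apply rule (*) once to a grid X of M + 1 rows by N + 1 columns."""
    col_sums = [sum(col) for col in zip(*X)]
    row_sums = [sum(row) for row in X]
    return [[x + alpha * (col_sums[n] - x) - beta * (row_sums[m] - x)
             for n, x in enumerate(row)]
            for m, row in enumerate(X)]


def relax(X, alpha, beta, steps):
    """Repeat rule (*), rescaling so the largest magnitude is 1 each step."""
    for _ in range(steps):
        X = relax_step(X, alpha, beta)
        scale = max(abs(x) for row in X for x in row)
        X = [[x / scale for x in row] for row in X]
    return X


M, N = 2, 3
alpha, beta = 1.0 / M, 1.0 / N
lam_plus = 1 + alpha * M + beta

# Constant columns, zero row sums: one step should scale it by lam_plus.
steady = [[3.0, 1.0, -1.0, -3.0] for _ in range(M + 1)]
scaled = relax_step(steady, alpha, beta)

# An arbitrary start should settle into constant columns with zero row sums.
start = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 2.0], [5.0, 0.0, 1.0, 1.0]]
settled = relax(start, alpha, beta, 60)
for row in settled:
    print([round(x, 6) for x in row])
```

In this run every row of the settled grid is the same, so each column is constant, and each row sums to zero, as the analysis predicts when $\lambda_+$ is the largest eigenvalue.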

4. Distributed (Massively Connected) Computing

The main restriction imposed by the connectionist paradigm is that no symbolic information is
passed from unit to unit. This restriction makes it difficult to employ standard computational
devices like parameterized functions. In this section, we present connectionist solutions to a variety
of computational problems.

Using  a  Unit  to  Represent  a  Value 

A cornerstone of our approach is the dedication of a separate unit to each value of each
parameter of interest, which we term the unit/value principle. We will show how to compute using
unit/value networks and present arguments that the number of units required is not unreasonable.
In this representation the output may be thought of as a confidence measure. If a unit representing
depth = 2 saturates then the network is expressing confidence that the distance of some object from
the retina is two depth units. There is much neurophysiological evidence to suggest unit/value
organizations in less abstract cortical organizations. Examples are edge sensitive units [Hubel and
Wiesel, 1979] and perceptual color units [Zeki, 1980], which are relatively insensitive to illumination
spectra. Experiments with cortical motor control in the monkey and cat [Wurtz and Albano, 1980]
indirectly hint at a unit/value organization. Our hypothesis is that the unit/value organization is
widespread, and is a fundamental design principle.

Although many physical neurons do seem to follow the unit/value rule and respond according
to the reliability of a particular configuration, there are also other neurons whose output represents
the range of some parameter, and apparently some units whose firing frequency reflects both range
and strength information. Both of the latter types can be accommodated within our definition of a
unit, but we will employ only unit/value cells in the remainder of this paper.

In the unit/value representation, much computation is done by table look-up. Previous ideas
such as WTA networks, scaling networks, and deadly rivals still apply; they describe the dynamic
behavior of the table. Here we discuss the implications of the tables themselves, which are at the
core of what we mean by computing with connections.

As a simple example, let us consider the multiplication of two variables, i.e., z = xy. In the
unit/value formalism there will be units for every value of x and y that is important. Appropriate
pairs of these will make a connection with another unit cell representing a specific value for the
product. Figure 10 shows this for a small set of units representing values for x and y. Notice that
the confidence (expressed as output value) that a particular product is an answer is a linear function
of the maximum of the sums of the confidences of its two inputs. Note that the number of xy units
need not be as large as the product of the number of x and y inputs for the table to be useful.
Furthermore, the x and y inputs make conjunctive connections with their z-unit.

Figure  10:  Computing  with  Table  Look-Up  Units. 
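
The multiplication table can be sketched in Python with one dictionary entry per value unit. In this hedged sketch a z-unit's confidence is a linear function (here simply a scale factor) of the maximum, over the (x, y) pairs whose product is z, of the sum of the two input confidences; the function name `product_confidences`, the scale of 0.5, the sample confidences, and the rule that a pair contributes only when both inputs are non-zero (one reading of "conjunctive") are all assumptions.

```python
def product_confidences(x_conf, y_conf, scale=0.5):
    """Unit/value table for z = x * y.

    x_conf and y_conf map each represented value to the confidence (output)
    of its unit. Returns a dict mapping each product value to a confidence.
    """
    z_conf = {}
    for x, cx in x_conf.items():
        for y, cy in y_conf.items():
            if cx == 0 or cy == 0:
                continue                      # conjunctive: both must fire
            z = x * y
            z_conf[z] = max(z_conf.get(z, 0.0), scale * (cx + cy))
    return z_conf


x_units = {1: 2, 2: 8, 3: 1}      # the network is fairly sure that x = 2
y_units = {2: 9, 3: 1}            # and quite sure that y = 2
z_units = product_confidences(x_units, y_units)
best = max(z_units, key=z_units.get)
print(best, z_units[best])
```

The z-unit for 6 is reachable from two pairs (2 x 3 and 3 x 2) and takes the larger of the two sums, while the unit for 4 ends up with the highest confidence overall.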


Modifiers  and  Mappings 

The idea of function tables can be extended through the use of variable mappings. In our
definition of the computational unit, we included a binary modifier, m, as an option on every
connection. As the definition specifies, if the modifier associated with a connection is zero, the value
v sent along that connection is ignored. There is considerable evidence in nature for synapses on
synapses and the modifiers add greatly to the computational simplicity of our networks. Let us start
with an initial informal example of the use of modifiers and mappings. Suppose you wanted to
ignore the telephone in your office, but answer it at home. One intuitive way to do this is shown by
Figure 11.


Figure 11: Modifier ($m_j$) on a Connection.
The circular connection between links denotes a binary modifier. You probably don't want to
inhibit your own initiation of phone calls from the office, just the link between the ring and your
action. Of course, there are ways of encoding this without using modifiers, but it is easy to see how
modifiers permit whole behavior patterns to depend on a state change. By convention, we will
assume that a modifier blocks the connection when its source unit is active. Technically, m <- if v
= 0 then 1 else 0, where v is the output value of the unit which is the source of m.
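
The modifier convention translates directly into code. The sketch below applies it to the telephone example; the function names `modifier` and `gated_input` and the particular output values are illustrative assumptions.

```python
def modifier(source_output):
    """m <- if v = 0 then 1 else 0, where v is the source unit's output."""
    return 1 if source_output == 0 else 0


def gated_input(value, source_output):
    """Value delivered along a connection carrying a binary modifier.

    When the modifier is zero the value sent along the connection is ignored.
    """
    return value * modifier(source_output)


ring = 7                          # output of a hypothetical "phone rings" unit
print(gated_input(ring, source_output=9))   # "in office" unit is active
print(gated_input(ring, source_output=0))   # "in office" unit is silent
```

When the "in office" unit is active the ring never reaches the "answer" unit; when it is silent the ring passes through unchanged.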

A  slightly  more  complex  use  of  mappings  is  for  disjunctions.  Suppose  that  one  has  a  model  of 
grass  as  green  except  in  California  where  it  is  brown  (golden). 

Figure  12.  Grass  is  Green  Connection  Modified  by  California. 

Here we can see that grass and green are potential members of a coalition (can reinforce one
another) except when the link is blocked. This use is similar to the cancellation link of [Fahlman,
1979] and gives a crude idea of how context can affect perception in our models. Note that in
Figures 11 and 12 we are using a shorthand notation. A modifier touching a double-ended arrow
actually blocks two connections. (Sometimes we also omit the arrowheads when a connection is
double-ended.)

Mappings can also be used to select among a number of possible values. Consider the example
of the relation between depth, physical size, and retinal size of a circle. (For now, assume that the
circle is centered on and orthogonal to the line of sight, that the focus is fixed, etc.) Then there is a
fixed relation between the size of the retinal image and the size of the physical circle for any given
depth. That is, each depth specifies a mapping from retinal to physical size, i.e.,

Figure 13: Depth Network.

Here we suppose the scales for depth and the two sizes are chosen so that unit depth means the
same numerical size. If we knew the depth of the object (by touch, context, or magic) we would
know its physical size. The network above allows retinal size 2 to reinforce physical size 2 when
depth = 1 but inhibits this connection for all other depths. Similarly, at depth 3, we should
interpret retinal size 2 as physical size 8, and inhibit other interpretations. Several remarks are in
order. First, notice that this network implements a function phys = f(ret,dep) that maps from
retinal size and depth to physical size, providing an example of how to replace functions with
parameters by mappings. For the simple case of looking at one object perpendicular to the line of
sight, there will be one consistent coalition of units which will be stable. The network does
something more, and this is crucial to our enterprise; the network can represent the consistency
relation R among the three quantities: depth, retinal size, and physical size. It embodies not only
the function f, but its two inverse functions as well (dep = f1(ret,phys), and ret = f2(phys,dep)).
(The network as shown does not include the links for f1 and f2, but these are similar to those for f.)
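
The point that one relation R yields f, f1, and f2 can be illustrated with a small table of consistent triples. The sketch below is an assumption-laden toy: the scaling rule phys = ret * 2^(dep - 1) is hypothetical, chosen only because it reproduces the values quoted in the text (retinal size 2 maps to physical size 2 at depth 1 and to physical size 8 at depth 3); the value ranges and all function names are likewise illustrative.

```python
# Hypothetical scaling rule; it is NOT given in the text, only the example values are.
def consistent(ret, dep, phys):
    return phys == ret * 2 ** (dep - 1)


SIZES = range(1, 17)   # assumed range of retinal and physical size values
DEPTHS = range(1, 5)   # assumed range of depth values

# The consistency relation R: every (retinal, depth, physical) triple that may
# form a stable coalition.
R = {(ret, dep, phys)
     for ret in SIZES for dep in DEPTHS for phys in SIZES
     if consistent(ret, dep, phys)}


def f(ret, dep):
    """phys = f(ret, dep): physical sizes consistent with a retinal size and depth."""
    return {p for (r, d, p) in R if r == ret and d == dep}


def f1(ret, phys):
    """dep = f1(ret, phys): depths consistent with a retinal and physical size."""
    return {d for (r, d, p) in R if r == ret and p == phys}


def f2(phys, dep):
    """ret = f2(phys, dep): retinal sizes consistent with a physical size and depth."""
    return {r for (r, d, p) in R if p == phys and d == dep}


print(f(2, 1), f(2, 3), f1(2, 8), f2(4, 1))
```

Each lookup returns a set because, as with the networks themselves, the functions read off a relation need not be single-valued.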

Most of Section 5 is devoted to laying out networks that embody theories of particular visual
consistency relations.

The idea of modifiers is, in a sense, complementary to that of conjunctive connections. For
example, the network of Figure 13 could be transformed into the following network (Figure 14).

Figure  14:  An  Alternate  Depth  Network. 

In this network the variables for physical size, depth, and retinal size are all given equal weight. For
example, physical size = 4 and depth = 1 make a conjunctive connection with retinal size = 4.
Each of the variables may also form a separate WTA network; hence rivalry for different depth
values can be settled via inhibitory connections in the depth network.

To see how the conjunctive connection strategy works in general, suppose a constraint relation
to be satisfied involves a variable x, e.g., f(x,y,z,w) = 0. For a particular value of x, there will be
triples of values of y, z, and w that satisfy the relation f. Each of these triples should make a
conjunctive connection with the unit representing the x-value. There could also be 3-input
conjunctions at each value of y, z, w. Each of these four different kinds of conjunctive connections
corresponds to an interpretation of the relation f(x,y,z,w) = 0 as a function, i.e., x = f1(y,z,w), y =
f2(x,z,w), z = f3(x,y,w), or w = f4(x,y,z). Of course, these functions need not be single-valued. This
network connection pattern could be extended to more than four variables, but high numbers of
variables would tend to increase its sensitivity to noisy inputs. Hinton has suggested a special
notation for the situation where a network exactly captures a consistency relation. The mutually
consistent values are all shown to be centrally linked (see Figure 15).

Figure  15:  Hinton's  Notation. 

When should a relation be implemented with modifiers and when should it be implemented
with conjunctive connections? A simple, non-rigorous answer to this question can be obtained by
examining the size of two sets of units: (1) the number of units that would have to be inhibited by
modifiers; and (2) the number of units that would have to be reinforced with conjunctive
connections. If (1) is larger than (2), then one should choose modifiers; otherwise choose
conjunctive connections. Sometimes the choice is obvious: to implement the brown Californian
grass example of Figure 12 with conjunctive connections, one would have to reinforce all units
representing places that had green grass! Clearly in this case it is easier to handle the exception
with modifiers. On the other hand, the depth relation R(phy,dep,ret) is more cheaply implemented
with conjunctive connections.

In physical neurons, there is a feature that makes modifiers more powerful than our examples
suggest. Inhibitory connections can block inputs from entire dendritic subtrees, and this could
simplify certain networks.

Time  and  Sequence 

Connectionist models do not initially appear to be well-suited to representing changes with
time. The network for computing some function can be made quite fast, but it will be fixed in
functionality. There are two quite different aspects of time variability of connectionist structures to
discuss: long-term modification of the networks (through changing weights) and short-term changes
in the behavior of a fixed network with time. There are a number of biologically suggested
mechanisms for changing the weight (wj) of synaptic connections, but none of them are nearly rapid
enough to account for our ability to hear, read, or speak. The ability to perceive a time-varying
signal like speech or to integrate the images from successive fixations must be achieved (according
to our dogma) by some dynamic (electrical) activity in the networks.

As usual, we will present computational solutions to these problems that appear to be consistent
with known structural and performance constraints. These are, again, too crude to be taken literally
but do suggest that connectionist models can describe the phenomena. As a first example, consider
the problem of controlling a simple physical motion, such as throwing a ball. It is not hard to
imagine that for a skilled motor performance we have a fixed sequence of unit-groups that fire each
other in succession, leading to the motor sequence. The computational problem is that there is a
unique set of effector units (say at the spinal level) that must receive input from each group at the
right time.


Figure  16:  A  Simple  Sequencer  Using  Modifiers. 

Figure 16 depicts a situation where two effectors, e1 and e2, get activity from four sequential
groups of three units each. At odd intervals, the middle layer masks the upper connections, and at
even intervals, the lower. We assume that each column gets activated synchronously and in order.
The main point is that a succession of outputs to a single effector set can be modelled as a sequence
of time-exclusive groups representing instantaneous coordinated signals. Moving from one time step
to the next could be controlled by pure timing, or (more realistically in many cases) by a
proprioceptive feedback signal. There is, of course, an enormous amount more than this to motor
control, and realistic models would have to model force control, ballistic movements, gravity
compensation, etc.

The sequencer model for skilled movements was greatly simplified by the assumption that the
sequence of activities was pre-wired. How could we (still crudely, of course) model a situation like
speech perception where there is a largely unpredictable time-varying computation to be carried out?
The idea here is to combine the sequencer model of Figure 16 with a simple vision-like scheme.
We assume that speech is recognized by being sequenced into a buffer of about the length of a
phrase and then is relaxed against context in the way described above for vision. For simplicity, we
will assume that there are two identical buffers, each having a pervasive modifier (mj) innervation
so that either one can be switched into or out of its connections. We are particularly concerned with
the process of going from a sequence of potential phonemes into an interpreted phrase. Figure 17
gives an idea of how this might happen.

Figure 17: A Phoneme Sequence Buffer.

We assume that there is a separate unit for each potential phoneme for each time step up to
the length of the buffer. The network which analyzes sound is connected identically to each column,
but conjunction allows only the connections to the active column to transmit values. Under ideal
circumstances, at each time step exactly one phoneme unit would be active. A phrase would then be
laid out on the buffer like an image on the "mind's eye," and the analogous kind of relaxation
cones involving morphemes, words, etc., could be brought to bear. The more realistic case where
sounds are locally ambiguous presents no additional problems. We assume that, at each time step,
the various competing phonemes get varying activation. Diphone constraints could be captured by
(+ or -) links to the next column as suggested by Figure 17. We are now left with a multiple
possibility relaxation problem, again exactly like that in visual perception. The fact that each
potential phoneme could be assigned a row of units is essential to this solution; we do not know
how to make an analogous model for a sequence of sounds which cannot be clearly categorized and
combined. Recall that the purpose of this example is to indicate how time-varying input could be
treated in connectionist models. The problem of actually laying out detailed models for language
skills is enormous and our example may or may not be useful in its current form. Some of the
considerations that arise in distributed modeling of language skills are presented in [Arbib and
Caplan, 1979].

Conserving  Connections 

It is currently estimated that there are about 10^11 neurons and 10^15 connections in the human
brain and that each neuron receives input from about 10^3-10^4 other neurons. These numbers are
quite large, but not so large as to present no problems for connectionist theories. It is also important
to remember that neurons are not switching devices; the same signal is propagated along all of the
outgoing branches. For example, suppose some model called for a separate, dedicated path between
all possible pairs of units in two layers of size N. It is easy to show that this requires N^2
intermediate sites. This means, for example, that there are not enough neurons in the brain to
provide such a cross-bar switch for layers of a million elements each. Similarly, there are not
enough neurons to provide one to represent each complex concept at every position, orientation,
and scale of visual space. Although the development of connectionist models is in its perinatal
period, we have been able to accumulate a number of ideas on how some of the required
computations can be carried out without excessive resource requirements. Two of the most
important of these are described below. A third important idea is that of sequencing, but that will
be deferred to Section 5 in order to develop it in the context of a detailed example from vision.
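
The cross-bar argument is a one-line calculation, written out here as a check. The sketch assumes the neuron estimate of about 10^11 used above; the names NEURONS_IN_BRAIN and crossbar_sites are illustrative only.

```python
NEURONS_IN_BRAIN = 10 ** 11   # rough estimate assumed from the discussion above


def crossbar_sites(n):
    """Intermediate sites needed for a dedicated path between every pair of
    units drawn from two layers of n units each."""
    return n * n


layer_size = 10 ** 6          # "layers of a million elements each"
sites = crossbar_sites(layer_size)

# 10^12 sites are needed, ten times the assumed number of neurons.
print(sites, sites > NEURONS_IN_BRAIN)
```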

Fixed  Resolution  Computation 

In the multiplication example of Figure 10 it might seem that NxNy units are required to
implement this simple function and that in general the number of units would grow exponentially
with the number of arguments. However, there are several refinements which can drastically reduce
the number of required units. The principal way to do this is to fix the number of units at the
resolution required for the computation. Figure 18 shows the network of Figure 10 modified when
less computational accuracy is required.

Figure 18: Modified Table Using Fewer Units.


When the number of variables in the function becomes large, it might seem that the fan-in or
number of input connections might become unrealistically large. For example, the function t
= f(u,v,w,x,y,z), implemented with 100 values of t when each of its arguments can have 100 distinct
values, would require an average number of inputs per unit of 10^12/10^2, or 10^10! However, there
are simple ways of trading units for connections. One is to replicate the number of units with each
value. This is a good solution when the inputs can be partitioned in some natural way as in the
vision examples in the next section. Another is to use intermediate units when the computation can
be decomposed in some way. For example, if f(u,v,w,x,y,z) = g(u,v) o h(w,x,y,z), where o is some
composition, then separate tables for f(g,h), g(u,v), and h(w,x,y,z) can be used. The outputs from
the g and h tables can be combined in conjunctive connections according to the composition
operator o via a third table to produce f. In perception the transition from u,v,w,x,y,z to g,h to f
corresponds to changes in level of abstraction.
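
The saving from decomposition can be made concrete with the numbers of the example. The sketch below assumes, purely for illustration, that the intermediate quantities g and h are also represented at 100 distinct values each; that resolution is not stated in the text, and average_fan_in is a hypothetical helper.

```python
def average_fan_in(num_args, values_per_arg, output_values):
    """Input combinations of a table, spread evenly over its output units."""
    return values_per_arg ** num_args // output_values


V = 100  # distinct values per variable (from the example in the text)

# One monolithic table for f(u,v,w,x,y,z): 10^12 combinations over 100 units.
monolithic = average_fan_in(6, V, V)

# Decomposed tables, assuming g and h are each quantized to 100 values.
g_table = average_fan_in(2, V, V)    # g(u,v)
h_table = average_fan_in(4, V, V)    # h(w,x,y,z)
f_table = average_fan_in(2, V, V)    # f(g,h)

print(monolithic, g_table, h_table, f_table)
```

Under these assumptions the largest fan-in falls from 10^10 to 10^6 (the h table), at the cost of the extra units that represent g and h.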

Low  Resolution  Grain 

Suppose we have a set of units to represent a vector parameter v composed of components (r,s).
Suppose that the number of units required to represent the subspace r is Nr and that required to
represent s is Ns. Then the number of units required to represent v is NrNs. It is easy to construct
examples in vision where the product NrNs is too close to the upper bound of 10^11 units to be
realistic. Consider the case of trihedral vertices, an important visual cue. Three angles and two
position coordinates are necessary to uniquely define every possible trihedral vertex. If we use 5
degree angle sensitivity and 10^3 spatial sample points, the number of units is given by Nr ~ 5x10^5
and Ns = 10^3, or NrNs = 5x10^8! How can we achieve the required representation accuracy with
fewer units?

In many instances, we can take advantage of the fact that the actual occurrence of parameters is
low density. What we mean by this in terms of trihedral vertices is that in an image, such vertices
will rarely occur in tight spatial clusters. (If they do, one cannot resolve them as individuals
simultaneously.) However, even though simultaneous proximal values of parameters are unlikely,
they still can be represented accurately for other computations.

The solution is to decompose the space (r,s) into two subspaces, each with unilaterally reduced
resolution.

Instead of NrNs units, we represent v with two spaces, one with Nr'Ns units
where Nr' << Nr and another with NrNs' units where Ns' << Ns.

To illustrate this technique with the example of trihedral vertices we choose Ns' = 0.01 Ns and
Nr' = 0.01 Nr. Thus the dimensions of the two sets of units are:

Ns'Nr = 5x10^6
and

NsNr' = 5x10^6.

The choices mean that we have one type of unit which accurately represents the angle
measurements and fires for any trihedral vertex in a given visual region, and another set of units
which fire only if a vertex is present at the precise position. Figure 19 shows the two cases.

Figure 19: Fuzzy Resolution Trick.

If the vertex enters into another relation, say R(v,a), where both its angle and position are
required accurately, one simply conjunctively connects pairs of appropriate units from each of the
reduced resolution spaces to appropriate a-units. The conjunctive connection represents the
intersection of each of its components' fields.

This resolution device is a variant of a general result due to [Hinton, 1980]; namely, that
connections from overlapping sets of units can produce fine resolution with fewer units. An important
limitation of this technique, however, is that the input must be sparse. If inputs are too closely
spaced, "ghost" firings will occur (Figure 20). Another point is that the resolution device is
essentially a units/connections tradeoff, but as the brain has many more synapses than neurons, the
tradeoff is attractive.
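
A toy version of the reduced-resolution device shows both the recovery of full resolution by intersection and the appearance of ghosts. The sketch below is an illustration under simplifying assumptions: r and s are small integers, each coarse field covers a fixed block of `coarse` consecutive values, and the function names encode and decode are hypothetical.

```python
from itertools import product


def encode(points, coarse):
    """Activate units in the two reduced-resolution spaces.

    fine_r units carry an accurate r and a coarse s; fine_s units carry a
    coarse r and an accurate s.
    """
    fine_r = {(r, s // coarse) for r, s in points}
    fine_s = {(r // coarse, s) for r, s in points}
    return fine_r, fine_s


def decode(fine_r, fine_s, coarse):
    """Conjunctive connections: a full-resolution (r, s) value is supported
    when one active unit from each space has a field that contains it."""
    return {
        (r, s)
        for (r, s_cell), (r_cell, s) in product(fine_r, fine_s)
        if r // coarse == r_cell and s // coarse == s_cell
    }


# A single input is recovered exactly.
print(decode(*encode({(12, 37)}, 10), 10))

# Two well-separated inputs are both recovered, with nothing extra.
print(decode(*encode({(12, 37), (55, 81)}, 10), 10))

# Two inputs inside the same coarse fields also produce the ghost
# combinations (12, 33) and (14, 37).
print(sorted(decode(*encode({(12, 37), (14, 33)}, 10), 10)))
```

With 100 values on each axis and a coarse factor of 10, the two spaces together use 2,000 units in place of the 10,000 needed for the full product space.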


Figure  20:  "Ghost"  Firings. 


5. Some Implications for Early Visual Processing

Although early visual processing appears to be particularly well-suited for connectionist
treatment, there are a number of serious problems. Some of these arise from the immense size of
the cross product of the spatial dimensions with those of other interesting features such as color,
velocity, and texture. Thus to explain how image-like input such as color and optical flow are
related to abstract objects such as "a blue, fast-moving thing," it becomes necessary to use all the
techniques of the previous sections. We will work through an example in detail and show a solution
which uses realistic numbers of units, connections, and connections per unit. The main trap to avoid
is a solution that requires Xf^k units where X is the spatial dimension, f is the number of
measurement values per modality (color, distance, velocity, etc.), and k is the number of modalities.

The above example omits the details of the transformation involved in relating image-like
features, like primary color measurements, to a percept, like "blue." To remedy this deficiency, we
will work through a second example, the calculation of shape from shading, which emphasizes this
kind of transformation.

Objects 

The visual field contains objects that are disjoint. This separateness is manifest in groups of
spatially registered features such as texture and color which distinguish the objects. Thus we regard
the problem of detecting an object as a matter of determining which of several possible features of
color, texture, motion, and shape it has. In fact we can view these features as having ranges of
associated parameter values. For example, an object could be described as having the properties
"fast" and "blue." The quotes signify that the property is not a single value but incorporates an
appropriate range of values. The property "blue" might be any of a set of primary red, green, and
blue values, each of which satisfies some relationship M(r,g,b) which defines the percept "blue." In
a one-dimensional retina, we might imagine arrays of spatially registered color sensors, each
appropriately connected to a property unit, as shown in Figure 21.

Figure 21: Feature Measurement: A Unit which Responds to a Range of Blue.

Figure 21 shows two kinds of units: a property unit and spatially-registered sensor units. The sensor
units represent five different values of each of three parts of the color spectrum. In our primitive
design, a property unit has a high potential value (= 1) if any of the spatial sensor units for that
measurement value have a high potential, i.e.,

Blue <- ANY(X) Blue(X)

The non-spatially registered units represent ranges of feature values, e.g. the property "blue."
Objects can be described in terms of combinations of these properties. The property units, in turn,
receive inputs from groups of spatially registered primary color unit measurements. Here we take
advantage of the disjunctive nature of different groups of inputs to differentiate between different
parts of space. The number of connections to a property unit can always be reduced by replicating
the property unit and connecting the replicated units' outputs to a single property unit.

Tuning 

In Figure 21 the "blue" unit will respond to any values of primary inputs in the appropriate
ranges of "blue." This can make it susceptible to noise; for example, consider the case where the
object has some specific value of "blue" in the appropriate range and there are also random similar
values of "blue" at other image points. One way of ruling out these extraneous values is to tune the
"blue" unit to respond to only the appropriate set of primary color measurements. This is done by
using a fine-grained set of perceptual color units. Within this set there are many units
corresponding to colors in the range defined by "blue" (although only one is drawn in Figure 22).
Tuning the "blue" unit is accomplished by conjunctively connecting the appropriate color units to
their corresponding values of primary measurements at the "blue" unit inputs. This is shown in
Figure 22. The fine-tuned blue unit receives input from all parts of space and sums its input. By
firing only the appropriate fine-grain color unit, the "blue" unit is made to respond to only the
corresponding set of its activated inputs. Note that this is an instance of the general tuning method
(discussed in Section 4).

Figure  22:  Tuning  the  "Blue"  Unit  with  a  Precise  Value  Unit. 

A problem arises in the simple circuit of Figure 21 when the visual input contains more than
one object, that is, more than one group of spatially registered features. The simple network of
Figure 22 cannot detect the spatial distinctness of the two groups. To make this problem more
concrete, let us consider two spatially distinct items, one blue (B) and fast (F) and the other red (R)
and textured (T). In the simple network we must expect all feature units to have a high potential, i.e.,
B, T, R, and F, and there is no grouping of the two appropriate pairs, BF and RT. This is an
instance of the general problem of multi-attribute concepts which has been viewed as a major
obstacle to connectionist schemes.

One solution to this is to elaborate all BF(x) units, but this poses two problems. First, there
are a large number of units, i.e., (Nm)^2 Nx where Nm is the number of feature groups and Nx is
the spatial quantification. For the retina (even the fovea) this number becomes unrealistically large.
A solution is to allow pairs of coarse-grained property units which still do not use the spatial
registration explicitly. In Figure 23 we show the circuit for a BF cell which assumes a high potential
only if its inputs from sensor units are spatially registered. This is done by making spatially
registered units have conjunctive connections. That is, appropriate values of color at x = 7 and
velocity at x = 7 would make a conjunctive connection, but appropriate values that were not
spatially registered would not. Of course, the BF unit might have to normalize its input in the
manner of Section 2, if there were many visual features present. Note that the velocity measurement
portion of the network is not drawn but that it is nearly identical. Velocity sensors are connected to
fine-grained velocity units in the same way as color sensors are connected to fine-grained color
units. Models for the various parts of velocity-sensitive networks have been explored by [Horn and
Schunck, 1980; Barnard and Thompson, 1979; and Ballard, 1981c].
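
The difference between unregistered property units and a spatially registered pair unit can be seen in a toy scene. In this sketch each feature is reduced to the set of positions where its sensors are active; the positions chosen and the names property_unit and registered_pair_unit are illustrative assumptions, and tuning and normalization are ignored.

```python
def property_unit(active_positions):
    """Fires (1) if any spatial sensor for the property is active, anywhere."""
    return int(bool(active_positions))


def registered_pair_unit(positions_a, positions_b):
    """Fires (1) only if some position carries both features: one conjunctive
    connection per position, pairing the two sensors at that position."""
    return int(bool(positions_a & positions_b))


# Toy scene: a blue, fast item at x = 7 and a red, textured item at x = 2.
blue, fast = {7}, {7}
red, textured = {2}, {2}

# All four property units fire, so they cannot say which features go together.
print([property_unit(p) for p in (blue, fast, red, textured)])

# The registered pair units recover the grouping: BF and RT fire, BT and RF do not.
print(registered_pair_unit(blue, fast), registered_pair_unit(red, textured))
print(registered_pair_unit(blue, textured), registered_pair_unit(red, fast))
```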

Figure 23: A BF Unit Detects the Existence
of Spatially Registered "Blue" and "Fast" Percepts.

Despite all these ways of economizing, it is still combinatorially implausible to have complex
cells such as BTRM(...). However, there is a way around this problem using multiple connections
from object units. A blue-textured-fast (BTF) object unit can be synthesized from BT and BF units,
i.e., BT & BF ~> BTF. What the ~> symbol means is that the implication is not guaranteed, but
very likely, given that the BF and BT units are tuned. The BF and BT units detect spatial
registration directly via connections like those in Figure 23, but the BTF unit does not. We are
saying essentially that, in general, simple (here, pairwise) conjunctions can be kept spatially
registered by conjunctive connections, and that more complex property combinations can be
synthesized from them. Complex combinations that are important to an individual are presumed to
have new units recruited [Feldman, 1980; 1981] to represent them explicitly.

Sequencing

One might imagine that the network in Figure 23 is adequate and that given visual input, it
will converge to appropriate potential values. However, there is an easy way to see that, in general,
this will not happen for all percepts. First consider the fine-grained set of feature values in Figure
22. Since units in this network receive inputs from all parts of space, diffusely valued spatial points
can easily obscure a set of contiguous spatial points with a single value. For example, motion from
other parts of space can obscure the motion of a particular object (in feature space). Another way to
understand this is to suppose a set of features formed a single, multi-dimensional space. In this space
objects are clear, but because of the enormous size of the space, it must be represented with
different projections. The projections are the familiar subspaces of color, velocity, etc.


There is a way to use the subspaces to cause the appropriate percepts to fire by building
conjunctions of features in a sequential manner. This circumvents the basic problem described
above. To develop the sequential solution, we introduce spatial units, such as those shown in Figure
24. There is a spatial unit for each spatial position and each intrinsic parameter (e.g. color).

Figure 24: A Blue-Fast Unit showing
Spatial Context Units.

Spatial units receive input from percept units, feature units, and sensors. If all three of these
are present, the unit will adjust the potentials of all of the sensor units upwards. We assume that
upper and lower bound units will adjust the potentials of the entire sensor network. The net result
will be that spatial sensor units not receiving appropriate input will have no effect on the property
units.

With respect to Figure 24, we suggest the following scenario for a "blue," "fast-moving,"
"horizontal" surface, where blue stands out in the color feature space, but fast-moving and
horizontal do not in their respective feature spaces. First, the blue percept unit causes input from
blue valued spatial positions to be favored. Under this restriction, one of the other features is now
distinct, further raising the potential of active sensor cells at those positions. The effect is as if a
blue filter were placed in front of the sensory input. Now the third feature is detected. At this point
the composite network indicates that there is a blue, fast-moving, horizontal surface in the visual
field at the position specified by high-confidence spatial units.

Our solution to the problem of detecting spatially-registered features is not unique but does
require far fewer than Xf^k units. Table 2 summarizes the connection and unit requirements in
terms of spatial complexity estimates.

Table 2

k = number of modalities
X = number of distinct spatial values
F = number of distinct "fuzzy" features/modality
f = number of distinct fine-resolution features/modality

Unit Type              No. of Units    Connections/Unit    Connects Units of
Sensors (S)            kXf             F                   ff, C
Fine-features (ff)     kf              X                   FF, C
Fuzzy-features (FF)    F~              X(f/F)^2            C
Spatial (C)            kXF             F^2                 S, ff

Motor Control of the Eye

To see how this notion of distributed objects might work in motor control, we offer a simplistic
model of vergence eye movements. (The same idea may be valid for fixations, but control probably
takes place at higher levels of abstraction.) In this model retinotopic (spatial) units are connected
directly to muscle control units. Each retinotopic unit can, if saturated, cause the appropriate
contraction so that the new eye position is centered on that unit. When several retinotopic units
saturate, each enables a muscle control unit independently and the muscle itself contracts an average
amount.

Figure 25 shows the idea for a one-dimensional retina. For example, with units at positions 2,
4, 5, and 6 saturated, the net result is that the muscle is centered at 17/4 or 4.25. This idea can be
extended if we assume the retinotopic units have overlapping fields such as those used by [Hinton,
1980]. This kind of organization is consistent with studies of the organization of the superior
colliculus in the monkey [Wurtz and Albano, 1980].
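The averaging in this example can be written out as a minimal sketch. The function name and the choice to return None when no unit is saturated are illustrative assumptions, not part of the model as described.

```python
def muscle_target(saturated_positions):
    """Average the retinal positions of all saturated retinotopic units.

    Each saturated unit independently requests a movement that would
    center the eye on itself; the muscle settles on the mean request.
    """
    if not saturated_positions:
        return None  # assumption: no saturated unit means no command
    return sum(saturated_positions) / len(saturated_positions)


# Units at positions 2, 4, 5 and 6 are saturated, as in the text.
print(muscle_target([2, 4, 5, 6]))  # 17/4 = 4.25
```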

Figure 25: Distributed Control of Eye Fixations.

Notice that each retinotopic unit is capable of enabling different muscle control units. The
appropriate one is determined by the enabled x-origin unit, which inhibits commands to the
inappropriate control units via modifiers.

One problem with this simple network arises when disparate groups of retinotopic units are
saturated. The present configuration can send the eye to an average position if the features are truly
identical. Also, the network can be modified with additional connections so that only a single
connected component of saturated units is enabled by using additional object primitives. A version
of this motor control idea has already been used in a computer model of the frog tectum [Didday,
1976].

There are still many details to be worked out before this could be considered a realistic model
of vergence control, but it does illustrate the basic idea: local, spatially separate sensors have distinct,
active connections which could be averaged at the muscle for fine motor control.

Shape  from  Shading 

In a previous sub-section we showed how spatially distributed information could be connected
to a global object unit. There the issue was primarily one of feasibility. With a simple model, there
were large numbers of global features, yet it was possible to detect them all. Relationships between
image-like inputs and features were assumed but not stressed. In this section we present an example
of such relations by showing a network which computes surface orientation from an intensity array.

The specific example we will use is that of shape-from-shading. It is well-known that given the
orientation of a surface with respect to a viewer, its reflectance properties and the location of a
single light source, the brightness at a point of the viewer's retina can be determined. That is,
the reflectance function R(θ, φ, θ_s, φ_s), where θ, φ and θ_s, φ_s are orientations of the surface and
source respectively, allows us to determine I(x,y), the normalized intensity in terms of retinal
coordinates. However, the perceptual problem is the reverse: given I(x,y) and R( ... ), determine
θ(x,y), φ(x,y) and θ_s, φ_s.

In general, the problem of deriving θ(x,y), φ(x,y) and θ_s, φ_s is underdetermined. However,
Ikeuchi [1980] showed that the surface could be determined locally once θ_s, φ_s was specified. This
method has been extended [Ballard, 1981b] to the case where θ_s, φ_s is initially unknown.

The algorithm is outlined as follows. For a single light source, the intensity at a point on a
retina can be described in terms of the orientation of the normal of the corresponding surface point
and the source orientation. That is, in spherical coordinates,

$$I(x,y) = R(\theta, \phi, \theta_s, \phi_s) \qquad \text{(Eq. 5.1)}$$

where the angles θ and φ are functions of x and y. The viewer is assumed to be looking down the
z axis (towards its origin) under orthonormal viewing conditions. θ, φ, θ_s, and φ_s are spherical
angles measured with respect to this frame. Now by minimizing (I - R)^2 and appending a
smoothness constraint on θ and φ we have [Ikeuchi, 1980] an expression for the local error (if the
estimate for θ and φ is unreliable) as follows:

$$E(x,y) = (I - R)^2 + \lambda\left((\nabla^2\theta)^2 + (\nabla^2\phi)^2\right)$$

where λ is a Lagrange multiplier. For a minimum, E_θ and E_φ = 0, where subscripts on E and R
denote partial derivatives. Skipping some steps, this leads to

$$\theta(x,y) = \theta_{ave}(x,y) + T(x,y)\,R_\theta \qquad \text{(Eq. 5.2a)}$$
$$\phi(x,y) = \phi_{ave}(x,y) + T(x,y)\,R_\phi \qquad \text{(Eq. 5.2b)}$$

where θ_ave(x,y) and φ_ave(x,y) are local averages and

$$T(x,y) = \frac{1}{16\lambda}\,(I - R)$$

In solving these equations, θ_s and φ_s are assumed to be known. Ikeuchi used a parallel-iterative
method where θ_ave and φ_ave are calculated from a previous iteration.
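One pass of such a parallel-iterative scheme might look like the sketch below. It rests on several assumptions that the text does not fix: a Lambertian reflectance function for R, a four-neighbor local average with border cells reusing their own value, central differences for the partial derivatives of R, and the hypothetical names `reflectance`, `neighbor_average` and `iterate_once`.

```python
import math


def reflectance(theta, phi, theta_s, phi_s):
    """Assumed Lambertian R: cosine of the angle between normal and source."""
    dot = (math.sin(theta) * math.sin(theta_s) * math.cos(phi - phi_s)
           + math.cos(theta) * math.cos(theta_s))
    return max(0.0, dot)


def neighbor_average(grid, x, y):
    """Four-neighbor average; off-grid neighbors reuse the cell's own value."""
    height, width = len(grid), len(grid[0])
    total = 0.0
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        inside = 0 <= nx < width and 0 <= ny < height
        total += grid[ny][nx] if inside else grid[y][x]
    return total / 4.0


def iterate_once(theta, phi, intensity, theta_s, phi_s, lam=1.0, eps=1e-5):
    """Apply theta = theta_ave + T*R_theta and phi = phi_ave + T*R_phi once."""
    height, width = len(theta), len(theta[0])
    new_theta = [[0.0] * width for _ in range(height)]
    new_phi = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            t, p = theta[y][x], phi[y][x]
            r = reflectance(t, p, theta_s, phi_s)
            # Central-difference estimates of the partial derivatives of R.
            r_theta = (reflectance(t + eps, p, theta_s, phi_s)
                       - reflectance(t - eps, p, theta_s, phi_s)) / (2 * eps)
            r_phi = (reflectance(t, p + eps, theta_s, phi_s)
                     - reflectance(t, p - eps, theta_s, phi_s)) / (2 * eps)
            big_t = (intensity[y][x] - r) / (16.0 * lam)
            # Averages come from the previous iteration's grids.
            new_theta[y][x] = neighbor_average(theta, x, y) + big_t * r_theta
            new_phi[y][x] = neighbor_average(phi, x, y) + big_t * r_phi
    return new_theta, new_phi
```

A constant surface whose predicted brightness already matches the input is a fixed point of this update, since T vanishes and the local average returns the same value.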

To calculate θ_s and φ_s, we assume θ and φ are known and use a Hough technique. First we
form an array A[θ_s, φ_s] of possible values of θ_s and φ_s initialized to zero. Now we can solve the
Lambertian reflectance equation for φ_s. The Hough technique works as follows. For each surface
element and for each θ_s we calculate all φ_s that satisfy Eq. 5.1 and increment A[θ_s, φ_s], i.e.,
A[θ_s, φ_s] := A[θ_s, φ_s] + 1. After all surface elements have been processed, the maximum value of A
corresponds to the location of the point source. In [Ballard, 1981b] it is shown that calculation of
the source location can proceed in parallel with that of θ(x,y) and φ(x,y) and that the two
calculations will converge.
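The voting procedure can be sketched as follows. This is a simplified illustration under stated assumptions: instead of solving the reflectance equation for φ_s analytically, it tests every (θ_s, φ_s) cell against each surface element with a tolerance, it assumes the same Lambertian form for R, and the names `lambert_deg` and `hough_source` are hypothetical. Angles are in degrees so that array cells can be indexed by integers.

```python
import math


def lambert_deg(theta, phi, theta_s, phi_s):
    """Assumed Lambertian reflectance, with all angles given in degrees."""
    t, p, ts, ps = (math.radians(a) for a in (theta, phi, theta_s, phi_s))
    dot = (math.sin(t) * math.sin(ts) * math.cos(p - ps)
           + math.cos(t) * math.cos(ts))
    return max(0.0, dot)


def hough_source(elements, theta_s_values, phi_s_values, tol=1e-9):
    """Vote for every source cell consistent with each surface element.

    elements: iterable of (theta, phi, intensity) triples.
    Returns the cell with the most votes and the full accumulator.
    """
    acc = {(ts, ps): 0 for ts in theta_s_values for ps in phi_s_values}
    for theta, phi, intensity in elements:
        for ts, ps in acc:
            if abs(lambert_deg(theta, phi, ts, ps) - intensity) <= tol:
                acc[(ts, ps)] += 1  # A[theta_s, phi_s] := A[theta_s, phi_s] + 1
    best = max(acc, key=acc.get)
    return best, acc


# Synthetic surface elements lit from (theta_s, phi_s) = (30, 45) degrees.
true_source = (30, 45)
elements = [(t, p, lambert_deg(t, p, *true_source))
            for t in (10, 25, 40) for p in (0, 60, 120, 200, 300)]
best, acc = hough_source(elements, range(15, 76, 15), range(0, 360, 45))
print(best, acc[best])
```

With noisy intensities the tolerance would need to be widened and the accumulator peak would spread over neighboring cells.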

Implementation in Networks

The above description of shape-from-shading is geared to implementation on a conventional
computer. We now describe how these computations can be realized with connections between
networks of basic units. The general strategy is as follows. Variables in the above equations are
represented by networks of units where each unit has a discrete value. The connections between
value-units must be made in such a way that the networks converge to a set of value-units that have
potential equal to unity. These units will represent a particular solution for the input intensity
distribution.

The shape-from-shading calculations can be decomposed into two principal networks. One
represents the surface orientations at retinal points and the other represents possible illumination
angles. Thus there is a {θ(x,y), φ(x,y)}-network and a {(θ_s, φ_s)}-network. In addition, input values
of I(x,y) are assumed. The θ(x,y) sub-network is represented by units each representing a specific
value of θ at a particular point x,y. This representation requires N_θ N_x N_y units. Assuming N_x =
N_y ≈ 2^7 and N_θ = 2^5, the requirement is for 2^19 or < 10^6 units. The illumination angle network
uses units to represent pairs of values, one for θ_s and one for φ_s. The reason for this choice will
be discussed momentarily.

With these provisos we describe the connections between networks that compute shape from
shading in two parts. The first part describes connections from the {θ, φ} network to the network
that detects illumination direction. The second part describes the connections in the other direction.
For every position x,y and for each value of θ, φ, and I at that position, the appropriate values of
θ_s and φ_s which satisfy R = I can be precalculated. Thus θ, φ, I triples, each representing a specific
value, make conjunctive connections with θ_s, φ_s units. Figure 26 shows a representative connection.


Figure 26: A Portion of the Connections
Used to Detect Illumination Angle.

The θ_s, φ_s units are summation units. Each θ_s, φ_s unit sums the number of I, θ, φ input triples
that are firing and their potentials are proportional to this sum. The proportionality constant may be
known from physical considerations or may be adjusted by upper and lower bound units like those
described in Section 2. The θ_s, φ_s unit with the highest potential is the one that is consistent with
the maximum number of {θ, φ} and I units.

Two important considerations affect the design of the {(θ_s, φ_s)} network. One is the need for
good discrimination in the values of θ_s and φ_s. This led to the decision to use (θ_s, φ_s) pairs as
units instead of separate θ_s and φ_s units. In the latter case, solutions to the constraint relation that
satisfied pairs of (θ_s, φ_s) values may be obscured in the individual θ_s and φ_s networks owing to the
reduced dimensionality. The second consideration is the average number of connections per unit.
With 2^6 values for θ_s and φ_s there must be 2^12 units in the {(θ_s, φ_s)} network. An upper bound
on the number of connections to this network from the {(θ, φ)} network can be determined from
straightforward counting arguments. At each point x,y there are N_θ N_φ N_I combinations, or 2^15.
With N_x N_y = 2^14, this leads to 2^29 or roughly 10^9 total connections. Thus the average number of
connections per (θ_s, φ_s) unit is 2^29/2^12 or 2^17 ≈ 1.3 x 10^5. If this is unreasonably large, a simple
solution is to use auxiliary θ_s, φ_s units, which sum subsets of the inputs to a θ_s, φ_s unit. The θ_s, φ_s
unit then sums the outputs of the auxiliary units.
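The counting argument and the auxiliary-unit tradeoff can be checked with a few lines of arithmetic. The group size of 2^9 inputs per auxiliary unit is an arbitrary illustrative choice, and `two_level_sum` is a hypothetical helper, not something specified in the text.

```python
n_source_units = 2**6 * 2**6          # 2^12 (theta_s, phi_s) pair units
total_connections = 2**29             # upper bound from the counting argument
avg_fan_in = total_connections // n_source_units   # 2^17 inputs per unit


def two_level_sum(inputs, group_size):
    """Sum inputs through auxiliary units that each see a bounded subset.

    Returns the final sum and the number of auxiliary units used, which is
    also the fan-in of the main unit.
    """
    aux_outputs = [sum(inputs[k:k + group_size])
                   for k in range(0, len(inputs), group_size)]
    return sum(aux_outputs), len(aux_outputs)


firing = [1] * avg_fan_in             # worst case: every input triple fires
total, n_aux = two_level_sum(firing, 2**9)
print(avg_fan_in, total, n_aux)       # 131072 131072 256
```

With this grouping no unit receives more than 2^9 = 512 connections, at the cost of 2^8 = 256 extra units per (θ_s, φ_s) unit.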

For the second part we consider connections from the {(θ_s, φ_s)} network to the {θ, φ} network
needed to realize the constraint of Equations 5.2a and b. We will only consider the first equation
(since the second is treated similarly). This represents a constraint g(φ, θ_ave, I, θ, θ_s, φ_s) = 0. Given
values for these variables we can determine R(θ, φ, θ_s, φ_s) and R_θ to see whether or not Equation
5.2a is satisfied. Thus a straightforward application of our technique would use conjunctive
connections in groups of nine, as shown in Figure 27. For a particular θ value we examine all the
combinations of values for the other variables and connect the subset that satisfies Equation 5.2a to
the θ-unit.


Figure  27:  Alternate  Connection  Schemes 
for  Computing  Surface  Orientation. 

Here we have used four nearby values of θ to compute θ_ave. This implementation is unsatisfactory
for two reasons: (1) the large number of inputs in the conjunctive connections would be noise-sensitive;
and (2) there would be an unrealistically large number of connections on any one unit. To
solve both of these problems we use θ_ave units for each θ(x,y). While this only doubles the
number of units in the {θ, φ} network, it drastically reduces the number of connections to an
individual θ-unit. Assuming our earlier figures, this number is

N_θ N_θave N_I N_φ N_θs N_φs ≈ 10^9.

This number is still very large, but can be further reduced by further unit/connection
tradeoffs.

The introduction of the θ_ave unit to reduce connections represents a different kind of tradeoff
from the simpler tradeoff used to handle high density connections in the illumination angle
network. A specific value of θ_ave may be produced by several different combinations of nearby
θ's. Each of these groups of θ's makes a conjunctive connection with the θ_ave unit. However,
since we expect a unique value of θ(x,y), the unit behaves differently than that in the illumination
angle network. Rather than sum its inputs, each θ_ave unit adjusts its potential based on the
maximum of its conjunctive groups.
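The contrast between the two kinds of unit can be shown in a small sketch. Scoring a conjunctive group by the mean of its inputs is an assumption made here for illustration (it keeps the result in the unit's potential range); the function names are hypothetical.

```python
def group_strength(group):
    """Strength of one conjunctive group, taken here as the mean input."""
    return sum(group) / len(group)


def theta_ave_potential(groups):
    """A theta_ave unit follows its single best-supported conjunctive group."""
    return max(group_strength(g) for g in groups)


def summation_potential(groups):
    """An illumination-angle unit instead adds up support from all groups."""
    return sum(group_strength(g) for g in groups)


# Two groups of four nearby theta units that would average to the same value.
groups = [[1.0, 1.0, 1.0, 1.0],   # all four neighbors firing
          [0.5, 0.0, 0.5, 0.0]]   # a weakly supported alternative
print(theta_ave_potential(groups))   # 1.0
print(summation_potential(groups))   # 1.25
```

Taking the maximum means that weak evidence for alternative neighbor combinations does not inflate the potential of a θ_ave unit, which suits a quantity that is expected to have a unique value.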


Other Networks Determine Boundary Conditions

The brief outline for shape from shading calculations does not include a discussion of boundary
conditions. These can be calculated from other networks such as a disparity network. For example,
the existence of a depth discontinuity d(x,y) in the disparity network could inhibit connections
between the θ and φ on either side of the discontinuity. In general, such networks will interact in
many different ways to determine boundary conditions [Barrow and Tenenbaum, 1978]. Much
additional work needs to be done to specify these interactions more precisely.

Conclusions 

We have now completed five years of intensive effort on the development of connectionist
models and their application to the description of complex tasks. While we have only touched the
surface, the results to date are very encouraging. Somewhat to our surprise, we have yet to
encounter a challenge to the basic formulation. Our attempts to model in detail particular
computations [Sabbah, 1981; Ballard and Sabbah, 1981] have led to a number of new insights (for
us, at least) into these specific tasks. Attempts like this one to formulate and solve general
computational problems in realistic connectionist terms have proven to be difficult, but less so than
we would have guessed. There appear to be a number of interesting technical problems within the
theory and a wide range of questions about brains and behavior which might benefit from an
approach along the lines suggested in this report.

Appendix: Summary of Definitions and Notation


A unit is a computational entity comprising:

{q} -- a set of discrete states, < 10
p -- a continuous value in [-1,1], called potential (accuracy of 10 digits)
v -- an output value, integers 0 ≤ v ≤ 9
i -- a vector of inputs i_1, ..., i_n

and functions from old to new values of these

p <- f(i,p,q)
q <- g(i,p,q)
v <- h(i,p,q)

which we assume, for now, to compute continuously. The form of the f, g, and h functions will
vary, but will generally be restricted to conditionals and functions found on a hand calculator.
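As a rough illustration of this definition, a unit can be modeled as a small object holding p, q and v together with the three update functions. This is a discrete-time sketch, whereas the definition assumes continuous computation; the class name, the clamping of p and v to their stated ranges, and the example update functions are assumptions made here.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Unit:
    f: Callable[[Sequence[float], float, int], float]  # new potential
    g: Callable[[Sequence[float], float, int], int]    # new state
    h: Callable[[Sequence[float], float, int], float]  # new output
    p: float = 0.0
    q: int = 0
    v: int = 0

    def step(self, inputs):
        """Compute new p, q and v from the old values, then store them."""
        p_old, q_old = self.p, self.q
        self.p = max(-1.0, min(1.0, self.f(inputs, p_old, q_old)))
        self.q = self.g(inputs, p_old, q_old)
        self.v = max(0, min(9, int(self.h(inputs, p_old, q_old))))
        return self.v


# Example: a one-state unit that accumulates its inputs into the potential.
unit = Unit(
    f=lambda i, p, q: p + 0.1 * sum(i),
    g=lambda i, p, q: q,              # the single state never changes
    h=lambda i, p, q: round(9 * p),   # output follows the old potential
)
first = unit.step([1, 1])    # output still reflects the initial potential
second = unit.step([0, 0])   # output now reflects p = 0.2
print(first, second)
```

Because all three functions read the old values, the output lags the potential by one update in this discrete version.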

P-Unit

For some applications, we will use a particularly simple kind of unit whose output v is
proportional to its potential p (rounded) and which has only one state. In other words

p <- p + β Σ i_k    [-1 ≤ p ≤ 1]
v = αp - θ          [v = 0...9]

where β, α, θ are constants.

Connection Tables

In addition to graphical notation, the outgoing connections to other units can be described in
tabular form. Each outgoing v_j (only one for basic units) will have a set of entries of the form

(<receiving unit>, <index>, <±>, <type>)

where any of the last three constructs can be omitted and given its default value. The <±> field
specifies whether the link is excitatory (+) or inhibitory (-) and defaults to +. The <index> is the
input index j of i_j at the receiving end. This index can be used for specifying different weights.
Indexed inputs also allow for functionally different use of various inputs and many of our examples
exploit this feature. The <type> is either normal, modifier (m), or learning (x), the default being
normal.

Conjunctive Connections

In terms of our formalism, this could be described in a variety of ways. One of the simplest is
to define the potential in terms of the maximum, e.g.,

p <- p + Max(i_1 + i_2, i_3 + i_4, i_5 + i_6 - i_7)

The max-of-sum unit is the continuous analog of a logical OR-of-AND (disjunctive normal
form) unit and we will sometimes use the latter as an approximate version of the former. The
OR-of-AND unit corresponding to the above is:

p <- p + α OR (i_1 & i_2, i_3 & i_4, i_5 & i_6 & (not i_7))

Winner-take-all (WTA) networks have the property that only the unit with the highest potential
(among a set of contenders) will have output above zero after some settling time.
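The winner-take-all property can be illustrated with a minimal sketch. The update rule used here, in which a unit zeroes itself as soon as any competitor reports a value at least as high, is one simple rule that yields the property and is an assumption of this example; the function names are hypothetical.

```python
def wta_step(potentials):
    """One synchronous update: keep a unit only if it beats every rival."""
    result = []
    for k, p in enumerate(potentials):
        rivals = potentials[:k] + potentials[k + 1:]
        result.append(p if all(p > r for r in rivals) else 0.0)
    return result


def settle(potentials, max_steps=10):
    """Repeat the update until the outputs stop changing."""
    current = list(potentials)
    for _ in range(max_steps):
        nxt = wta_step(current)
        if nxt == current:
            break
        current = nxt
    return current


print(settle([0.3, 0.9, 0.5]))  # [0.0, 0.9, 0.0]
```

Under this particular rule an exact tie for the highest potential silences every contender, so a practical network would need some way of breaking ties.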


A coalition will be called stable when the output of all of its members is non-decreasing.


References 


Aho, A., J. Hopcroft, and J. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1974.

Anderson, J.R. and G.H. Bower. Human Associative Memory. Washington, DC: V.H. Winston and
Sons, 1972.

Arbib, M.A. and D. Caplan, "Neurolinguistics must be computational," The Brain and Behavioral
Sciences 2, 449-483, 1979.

Ballard, D.H., "Parameter networks," TR75, Computer Science Dept, U. Rochester, 1981a.

Ballard, D.H., "Shape and illumination angle from shading," Computer Science Dept, U. Rochester,
1981b.

Ballard, D.H., "3-d rigid body motion from optical flow," Computer Science Dept, U. Rochester,
1981c.

Ballard, D.H. and D. Sabbah, "On shapes," Computer Science Dept, U. Rochester, 1981.

Barnard, S.T. and W.B. Thompson, "Disparity analysis of images," TR 79-1, Computer Science
Dept, U. Minnesota, January 1979.

Barrow, H.G. and J.M. Tenenbaum, "Recovering intrinsic scene characteristics from images,"
Technical Note 157, AI Center, SRI Intl, April 1978.

"The Brain," Scientific American, September 1979.

Buser, P.A. and A. Rougeul-Buser (Eds). Cerebral Correlates of Conscious Experience. Amsterdam:
North Holland Publishing Co., 1978.

Collins, A.M. and E.F. Loftus, "A spreading-activation theory of semantic processing," Psych Review
82, 407-429, November 1975.

Didday, R.L., "A model of visuomotor mechanisms in the frog optic tectum," Math Biosciences 30,
169-180, 1976.

Edelman, G. and B. Mountcastle. The Mindful Brain. Boston, MA: MIT Press, 1978.

Fahlman, S.E., "The Hashnet interconnection scheme," Computer Science Dept, Carnegie-Mellon
U., June 1980.

Fahlman, S.E. NETL: A System for Representing and Using Real-World Knowledge. Boston, MA: MIT
Press, 1979.

Feldman, J.A., "A connectionist model of visual memory," in G.E. Hinton and J.A. Anderson (Eds).
Parallel Models of Associative Memory. Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers,
1981.

Feldman, J.A., "A distributed information processing model of visual memory," TR52, Computer
Science Dept, U. Rochester, 1980.

Friesen, W.O. and G.S. Stent, "Neural circuits for generating rhythmic movements," Ann Rev
Biophys Bioengg 7, 37-61, 1978.

Grossberg, S., "Biological competition: Decision rules, pattern formation, and oscillations," Proc
Natl Acad Sci USA 77, 4, 2238-2342, April 1980.

Hanson, A.R. and E.M. Riseman (Eds). Computer Vision Systems. NY: Academic Press, 1978.

Hinton, G.E., "Relaxation and its role in vision," Ph.D. thesis, Cognitive Studies Program, U.
Edinburgh, December 1977.

Hinton, G.E., Draft of Technical Report, U. California at San Diego, 1980.

Hinton, G.E. and J.A. Anderson (Eds). Parallel Models of Associative Memory. Hillsdale, NJ:
Lawrence Erlbaum Associates, Publishers, 1981.

Horn, B.K.P. and B.G. Schunck, "Determining optical flow," AI Memo 572, AI Lab, MIT, April
1980.

Hubel, D.H. and T.N. Wiesel, "Brain mechanisms of vision," Scientific American, 150-162,
September 1979.

Hummel, R.A. and S.W. Zucker, "On the foundations of relaxation labeling processes," TR-80-7,
Computer Vision and Graphics Lab, McGill U., July 1980.

Ikeuchi, K., "Numerical shape from shading and occluding contours in a single view," AI Memo
566, AI Lab, MIT, revised February 1980.

Kandel, E.R. The Cellular Basis of Behavior. CA: Freeman, 1976.

Kilmer, W.L., W.S. McCulloch, and J. Blum, "Some mechanisms for a theory of the reticular
formation," in M. Mesarovic (Ed). Systems Theory and Biology. NY: Springer-Verlag, 1968.

Kosslyn, S.M. and S.P. Schwartz, "Visual images as spatial representations in active memory," in
A.R. Hanson and E.M. Riseman (Eds). Computer Vision Systems. NY: Academic Press, 1978.

Marr, D., "Representing visual information," in A.R. Hanson and E.M. Riseman (Eds). Computer
Vision Systems. NY: Academic Press, 1978.

Marr, D.C. and T. Poggio, "Cooperative computation of stereo disparity," Science 194, 283-287,
1976.

Minsky, M., "K-Lines: A Theory of Memory," Cognitive Science 4, 2, 117-133, 1980.

Minsky, M. and S. Papert. Perceptrons. Cambridge, MA: The MIT Press, 1972.

Neisser, U. Cognition and Reality: Principles and Implications of Cognitive Psychology. San
Francisco, CA: W.H. Freeman and Co., 1976.

Perkel, D.H. and B. Mulloney, "Calibrating compartmental models of neurons," Am J Physiol 235,
1, R93-R98, 1979.

Prager, J.M., "Extracting and labeling boundary segments in natural scenes," IEEE Trans PAMI 2,
1, 16-27, January 1980.

Rosenfeld, A., R.A. Hummel, and S.W. Zucker, "Scene labelling by relaxation operations," IEEE
Trans SMC 6, 1976.

Russell, D.M., "Adaptive modules for low-level problem-solving," Computer Science Dept, U.
Rochester; submitted to IJCAI, 1981.

Sabbah, D., "Design of a highly parallel visual recognition system," Computer Science Dept, U.
Rochester; submitted to IJCAI, 1981.

Sejnowski, T.J., "Storing covariance with nonlinearly interacting neurons," J Math Biology 4, 4,
303-321, 1977.

Stent, G.S., W.B. Kristan, Jr., W.O. Friesen, C.A. Ort, M. Poon, and R.L. Calabrese, "Neuronal
generation of the leech swimming movement," Science 200, June 1978.

Torioka, T., "Pattern separability in a random neural net with inhibitory connections," Biol.
Cybernetics 34, 53-62, 1979.

Ullman, S., "Relaxation and constrained optimization by local processes," CGIP 10, 115-125, 1979.

von der Malsburg, Ch. and D.J. Willshaw, "How to label nerve cells so that they can interconnect
in an ordered fashion," Proc Natl Acad Sci USA 74, 11, 5176-5178, November 1977.

Warshaw, H.S. and D.K. Hartline, "Simulation of network activity in stomatogastric ganglion of the
spiny lobster, Panulirus," Brain Research 110, 259-272, 1976.

Wurtz, R.H. and J.E. Albano, "Visual-motor function of the primate superior colliculus," Ann Rev
Neurosci 3, 189-226, 1980.

Wickelgren, W.A., "Chunking and consolidation: A theoretical synthesis of semantic networks,
configuring in conditioning, S-R versus cognitive learning, normal forgetting, the amnesic
syndrome, and the hippocampal arousal system," Psych Review 86, 1, 1979.

Zeki, S., "The representation of colours in the cerebral cortex," Nature 284, April 1980.


[Figure 3.]

[Figure 21: Feature measurement: a unit which responds to a range of blue. Panel label: Color Sensor Units.]

[Figure 24a.]